US Government Rights

This invention was made with United States Government support under
Grant Nos. CA 90815, CA57653 and F32 CA72166, awarded by the National Institutes of
Health. The United States Government has certain rights in the invention.

Related Applications 

This application claims priority under 35 USC §119(e) to US Provisional
Application Serial No. 60/484,077, filed July 1, 2003, the disclosure of which is
incorporated herein by reference.

Background 

The mammalian immune system has evolved a variety of mechanisms to 
protect the host from cancerous cells. An important component of this response is
mediated by cells referred to as T cells. Cytotoxic T lymphocytes (CTL) are specialized
T cells that primarily function by recognizing and killing cancerous cells or infected cells,
but they can also function by secreting soluble molecules referred to as cytokines that can
mediate a variety of effects on the immune system. T helper cells primarily function by 
recognizing antigen on specialized antigen presenting cells, and in turn secreting 
cytokines that activate B cells, T cells, and macrophages. 

A variety of evidence suggests that immunotherapy designed to stimulate 
a tumor-specific CTL response would be effective in controlling cancer. For example, it 
has been shown that human CTL recognize sarcomas (Slovin et al., 1986, J Immunol 137, 
3042-3048), renal cell carcinomas (Schendel et al., 1993, J Immunol 151, 4209-4220), 
colorectal carcinomas (Jacob et al., 1997, Int J Cancer 71, 325-332), ovarian carcinomas 
(Peoples et al., 1993, Surgery 114, 227-234), pancreatic carcinomas (Peiper et al., 1997,
Eur J Immunol 27, 1115-1123), squamous tumors of the head and neck (Yasumura et al.,
1993, Cancer Res 53, 1461-1468), and squamous carcinomas of the lung (Slingluff et al.,
1994, Cancer Res 54, 2731-2737; Yoshino et al., 1994, Cancer Res 54, 3387-3390). The
largest number of reports of human tumor-reactive CTLs, however, has concerned
melanomas (Boon et al., 1994, Annu Rev Immunol 12, 337-365). The ability of
tumor-specific CTL to mediate tumor regression, in both human (Parmiani et al., 2002, J Natl
Cancer Inst 94, 805-818; Weber, 2002, Cancer Invest 20, 208-221) and animal models,
suggests that methods directed at increasing CTL activity would likely have a beneficial
effect with respect to tumor treatment.

Melanoma, or skin cancer, is a disease that is diagnosed in approximately
54,200 persons per year. Conventional therapy for the disease includes surgery, radiation
therapy, and chemotherapy. In spite of these approaches to treatment, approximately
7,600 individuals die in the United States every year due to melanoma. Overall, the
5-year survival rate for the disease is 88%. The survival rate drops, however, in more
advanced stages of the disease with only about 50% of Stage III patients, and 20-30% of
Stage IV patients surviving past five years. In patients where the melanoma has
metastasized to distant sites, the 5-year survival dips to only 12%. Clearly, there is a
population of melanoma patients that is in need of better treatment options. More
recently, in an attempt to decrease the number of deaths attributed to melanoma,
immunotherapy has been added to the arsenal of treatments used against the disease.

In order for CTL to kill or secrete cytokines in response to a cancer cell,
the CTL must first recognize the cancer cell (Townsend and Bodmer, 1989). This
process involves the interaction of the T cell receptor, located on the surface of the CTL,
with what is generically referred to as an MHC-peptide complex, which is located on the
surface of the cancerous cell. MHC (major histocompatibility complex)-encoded
molecules have been subdivided into two types, and are referred to as class I and class II
MHC-encoded molecules. In the human immune system, MHC molecules are referred to
as human leukocyte antigens (HLA). Within the MHC complex, located on chromosome
six, are three different loci that encode for class I MHC molecules. MHC molecules
encoded at these loci are referred to as HLA-A, HLA-B, and HLA-C. The genes that can
be encoded at each of these loci are extremely polymorphic, and thus, different
individuals within the population express different class I MHC molecules on the surface
of their cells. HLA-A1, HLA-A2, HLA-A3, HLA-B7, and HLA-B8 are examples of
different class I MHC molecules that can be expressed from these loci. The present
disclosure involves peptides that are associated with the HLA-A3 molecule.

The peptides which associate with the MHC molecules can either be 
derived from proteins made within the cell, in which case they typically associate with
class I MHC molecules (Rock and Goldberg, 1999, Annu Rev Immunol 17, 739-779); or
they can be derived from proteins which are acquired from outside of the cell, in which
case they typically associate with class II MHC molecules (Watts, 1997, Annu Rev
Immunol 15, 821-850). The peptides that evoke a cancer-specific CTL response most 
typically associate with class I MHC molecules. The peptides themselves are typically 
nine amino acids in length, but can vary from a minimum length of eight amino acids to a 
maximum of twelve amino acids in length. Tumor antigens may also bind to class II 
MHC molecules on antigen presenting cells and provoke a T helper cell response. The
peptides that bind to class II MHC molecules are generally twelve to nineteen amino 
acids in length, but can be as short as ten amino acids and as long as thirty amino acids. 
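
The length ranges stated above can be expressed as a simple screen. The following is a minimal sketch and not part of the disclosure: the function name `mhc_classes_by_length` is hypothetical, and the check considers length only, so it says nothing about whether a peptide actually binds an MHC molecule.

```python
# Length ranges stated in the text, in amino acids (inclusive).
CLASS_I_RANGE = (8, 12)    # typically nine residues
CLASS_II_RANGE = (10, 30)  # generally twelve to nineteen residues


def mhc_classes_by_length(peptide: str) -> list[str]:
    """Return the MHC classes whose stated length range admits the peptide.

    Hypothetical helper for illustration: it screens by length only.
    """
    n = len(peptide)
    classes = []
    if CLASS_I_RANGE[0] <= n <= CLASS_I_RANGE[1]:
        classes.append("class I")
    if CLASS_II_RANGE[0] <= n <= CLASS_II_RANGE[1]:
        classes.append("class II")
    return classes


# Placeholder peptides made of the "any amino acid" symbol X.
print(mhc_classes_by_length("X" * 9))   # within the class I range only
print(mhc_classes_by_length("X" * 11))  # within both stated ranges
```

Because the two ranges overlap between ten and twelve residues, length alone cannot assign such a peptide to a single class.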

The process by which intact proteins are degraded into peptides is referred 
to as antigen processing. Two major pathways of antigen processing occur within cells 
(Rock and Goldberg, 1999, Annu Rev Immunol 17, 739-779). One pathway, which is 
largely restricted to cells that are antigen presenting cells such as dendritic cells, 
macrophages, and B cells, degrades proteins that are typically phagocytosed or 
endocytosed into the cell. Peptides derived in this pathway typically bind to class II 
MHC molecules. A second pathway of antigen processing is present in essentially all 
cells of the body. This second pathway primarily degrades proteins that are made within 
the cells, and the peptides derived from this pathway primarily bind to class I MHC 
molecules. Antigen processing by this latter pathway involves polypeptide synthesis and 
proteolysis in the cytoplasm, followed by transport of peptides to the plasma membrane 
for presentation. These peptides, initially being transported into the endoplasmic 
reticulum of the cell, become associated with newly synthesized class I MHC molecules 
and the resulting complexes are then transported to the cell surface. Peptides derived 
from membrane and secreted proteins have also been identified. In some cases these 
peptides correspond to the signal sequence of the proteins which is cleaved from the 
protein by the signal peptidase. In other cases, it is thought that some fraction of the 
membrane and secreted proteins are transported from the endoplasmic reticulum into the 
cytoplasm where processing subsequently occurs. 

Once bound to the class I MHC molecule, the peptides are recognized by 
antigen-specific receptors on CTL. Several methods have been developed to identify the 
peptides recognized by CTL, each method of which relies on the ability of a CTL to
recognize and kill only those cells expressing the appropriate class I MHC molecule with
the peptide bound to it. Mere expression of the class I MHC molecule is insufficient to 
trigger the CTL to kill the target cell if the antigenic peptide is not bound to the class I 
MHC molecule. Such peptides can be derived from a non-self source, such as a pathogen 
(for example, following the infection of a cell by a bacterium or a virus) or from a self- 
derived protein within a cell, such as a cancerous cell. The tumor antigens from which 
the peptides are derived can broadly be categorized as differentiation antigens, 
cancer/testis antigens, mutated gene products, widely expressed proteins, and viral 
antigens (Castelli et al., 2000, J Cell Physiol 182, 323-331). 

Immunization with melanoma-derived, class I or class II MHC-encoded 
molecule associated peptides, or with a precursor polypeptide or protein that contains the 
peptide, or with a gene that encodes a polypeptide or protein containing the peptide, are 
forms of immunotherapy that can be employed in the treatment of melanoma. This form 
of immunotherapy requires that immunogens be identified so that they can be formulated 
into an appropriate vaccine. Although a variety of melanoma-derived antigens have been 
identified (Castelli et al., 2000, J Cell Physiol 182, 323-331; Rosenberg, 1999, Immunity
10, 281-287; Van den Eynde and van der Bruggen, 1997, Curr Opin Immunol 9, 684-693;
Wang and Rosenberg, 1999, Immunol Rev 170, 85-100), not all of these are appropriate
for broad-based immunotherapy as the expression of some of them is limited to the tumor
derived from a specific patient. Furthermore, the number of MHC molecules from which
tumor-derived peptides have been discovered is relatively limited, and largely restricted 
to HLA-A2. Thus, it would be useful to identify additional peptides that complex with 
class I MHC molecules other than HLA-A2. Such peptides would be particularly useful 
in the treatment of melanoma patients that do not express the HLA-A2 molecule. 

It is also particularly useful to identify antigenic peptides that are derived 
from different parent proteins, even if the derived peptides associate with the same class I 
MHC molecule. Because an active immune response can result in the outgrowth of 
tumor cells that have lost the expression of a particular precursor protein for a given 
antigenic peptide, it is advantageous to stimulate an immune response against peptides 
derived from more than one parent protein, as the chance of the tumor cell losing the
expression of both proteins is the product of the chances of losing each of the individual
proteins.
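
As a worked illustration of this multiplication, the sketch below assumes that the individual loss events are independent, which is what taking the product implies. The function name and the numerical probabilities are hypothetical and are not data from this disclosure.

```python
import math


def joint_loss_probability(loss_probs):
    """Probability that a tumor cell loses expression of every listed protein,
    assuming the individual loss events are independent."""
    return math.prod(loss_probs)


# Hypothetical per-protein loss probabilities, for illustration only.
p_protein_a = 0.01
p_protein_b = 0.02

joint = joint_loss_probability([p_protein_a, p_protein_b])
print(f"single antigen: {p_protein_a:.4f}, both antigens: {joint:.4f}")
```

Under these assumed values, escape from a response against both parent proteins is far less likely than escape from a response against either one alone.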






WO 2005/005612 

PCT/US2004/021168 



The present invention relates to genes, proteins, and peptides that may be used in 
the diagnosis and treatment of cancer, and in one embodiment the treatment of 
melanoma. More specifically, the invention relates to the isolation and purification of 
two novel tumor antigens that can be used as tools for the diagnosis, prevention, and 
treatment of cancer; and to DNA sequences that code the precursor proteins from which 
the tumor antigens are derived. 

Summary of Various Embodiments of the Invention 

The present invention is directed to a newly discovered gene family with 
multiple isoforms, designated TAG-1 (SEQ ID NO: 1); TAG-2a (SEQ ID NO: 2); TAG- 
2b (SEQ ID NO: 3); TAG-2c (SEQ ID NO: 4); and TAG-3 (SEQ ID NO: 5), proteins 
encoded by such nucleic acid sequences, and antibodies generated against said proteins. 
TAG-1, TAG-2a, TAG-2b, and TAG-2c all code for one or more proteins that can give 
rise to the antigenic peptide RLSNRLLLR (SEQ ID NO: 12). The RLSNRLLLR (SEQ
ID NO: 12) peptide binds to the class I MHC molecule, HLA-A3, and is recognized by 
melanoma-specific CTL. The genes, proteins, and peptides described herein may be used 
as diagnostic indicators of the presence of cancer and/or used in therapeutics to treat 
cancer. 

Brief Description of the Drawings

Fig. 1. The mRNA sequence and deduced protein sequence for each of the 
TAG genes coding for the RLSNRLLLR (SEQ ID NO: 12) peptide. The three potential
nonstandard initiation codons that are in frame with the open reading frame coding the
RLSNRLLLR (SEQ ID NO: 12) peptide are underlined. The shaded nucleotide sequence
indicates the 3' nucleotide of the 5' exon and the 5' nucleotide of the 3'
exon at each exon/exon splice site.

Fig. 2A & 2B. A schematic drawing of the genomic structure of the TAG 
gene exons is shown in Fig. 2A. Numbering is according to that obtained in Map Viewer 
on the NCBI website. Fig. 2B is a schematic drawing of the exon organization of the 
TAG mRNA.

Detailed Description of Embodiments
Definitions 

In describing and claiming the invention, the following terminology will 
be used in accordance with the definitions set forth below. 

As used herein, the term "purified" and like terms relate to an enrichment
of a molecule or compound relative to other components normally associated with the 
molecule or compound in a native environment. The term "purified" does not necessarily 
indicate that complete purity of the particular molecule has been achieved during the 
process. A "highly purified" compound as used herein refers to a compound that is 
greater than 90% pure. 

As used herein, the term "pharmaceutically acceptable carrier" includes 
any of the standard pharmaceutical carriers, such as a phosphate buffered saline solution, 
water, emulsions such as an oil/water or water/oil emulsion, and various types of wetting 
agents. The term also encompasses any of the agents approved by a regulatory agency of 
the US Federal government or listed in the US Pharmacopeia for use in animals, 
including humans. 

A polylinker is a nucleic acid sequence that comprises a series of three or 
more closely spaced restriction endonuclease recognition sequences.

"Operably linked" refers to a juxtaposition wherein the components are 
configured so as to perform their usual function. Thus, control sequences or promoters 
operably linked to a coding sequence are capable of effecting the expression of the 
coding sequence. 

The term "transgene" refers to any polynucleotide which is inserted by 
artifice into a cell, and becomes part of the genome of the organism which develops from 
that cell. Such a transgene may include a gene which is partly or entirely heterologous 
(i.e., foreign) to the transgenic organism, or may represent a gene homologous to an 
endogenous gene of the organism. The term "transgenic" refers to any cell which 
includes a DNA sequence which is inserted by artifice into a cell and becomes part of the 
genome of the organism which develops from that cell.

As used herein an "exogenous" nucleic acid or amino acid sequence refers 
to a nucleic acid or protein sequence that has been introduced into a host cell from a point 
outside the cellular membrane of the cell. Typically the exogenous nucleic acid sequence
is a recombinant heterologous gene (i.e. the gene contains a non-native promoter);
however, the exogenously introduced sequence may also be a gene that is endogenous to
the cell.

The term "non-native promoter" as used herein refers to any promoter that 
has been operably linked to a coding sequence wherein the coding sequence and the 
promoter are not naturally associated (i.e. a recombinant promoter/coding sequence 
construct). 

"Operably linked" refers to a juxtaposition wherein the components are 
configured so as to perform their usual function. Thus, control sequences or promoters 
operably linked to a coding sequence are capable of effecting the expression of the 
coding sequence. 

As used herein, a transgenic cell is any cell that comprises a nucleic acid 
sequence that has been introduced into the cell in a manner that allows expression of a 
gene encoded by the introduced nucleic acid sequence. 



As used herein, "nucleic acid," "DNA," and similar terms also include 
nucleic acid analogs, i.e. analogs having other than a phosphodiester backbone. For
example, the so-called "peptide nucleic acids," which are known in the art and have
peptide bonds instead of phosphodiester bonds in the backbone, are considered within the
scope of the present invention.

As used herein, the terms "complementary" or "complementarity" are 
used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the 
Watson & Crick base-pairing rules, i.e. two nucleic acid sequences that are capable of 
binding to one another in an anti-parallel base pairing arrangement. For example, the
sequence 5' A-G-T 3' is complementary to the sequence 3' T-C-A 5'. Complementarity
may be "partial," in which some of the nucleic acids' bases are not matched according to 
the base pairing rules. Or, there may be "complete" or "total" complementarity between 
the nucleic acids. 
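
The distinction between complete and partial complementarity can be illustrated with a short sketch. The function names are hypothetical, both strands are written 5' to 3', and strands of equal length are assumed for simplicity.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the fully complementary strand, also written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(seq_5_to_3.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that pair when strand_b is aligned
    anti-parallel to strand_a (both given 5' to 3', equal length)."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch assumes strands of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )
    return paired / len(strand_a)


# 5' A-G-T 3' pairs with 3' T-C-A 5', which reads 5' A-C-T 3'.
print(reverse_complement("AGT"))      # ACT
print(complementarity("AGT", "ACT"))  # 1.0, complete complementarity
print(complementarity("AGT", "ACC") < 1.0)  # True, only partial
```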

As used herein, the term "hybridization" is used in reference to the pairing 
of complementary nucleic acids. Nucleic acid sequences that share a high degree of
complementarity will bind together under highly stringent conditions. For example,
conditions of high stringency comprise the use of a hybridizing solution containing the
nucleic acid sequences in 55% formamide (Gibco-BRL), 10% dextran sulfate, 100 ng/µl
salmon sperm DNA in 2X SSC (300 mM NaCl, 30 mM Na citrate, pH 7.0) with a
hybridization temperature of 37°C. Washing is then conducted using three changes of
2X SSC at 20°C for 15 minutes per wash with slight agitation.
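
The salt concentrations of an SSC buffer scale linearly with its stated strength. The sketch below derives the 1X values from the 2X SSC composition given above; the 20X stock and the 500 ml wash volume in the usage lines are hypothetical examples, not conditions stated in this disclosure.

```python
# 1X SSC concentrations implied by the text:
# 2X SSC = 300 mM NaCl and 30 mM Na citrate.
SSC_1X_MM = {"NaCl": 150.0, "Na citrate": 15.0}


def ssc_concentrations(strength: float) -> dict[str, float]:
    """Millimolar concentration of each salt in SSC of the given strength."""
    return {salt: conc * strength for salt, conc in SSC_1X_MM.items()}


def stock_volume_ml(stock_x: float, target_x: float, final_volume_ml: float) -> float:
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    return target_x * final_volume_ml / stock_x


print(ssc_concentrations(2))        # composition of 2X SSC
print(stock_volume_ml(20, 2, 500))  # ml of an assumed 20X stock for 500 ml of 2X
```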

The term "peptide" encompasses a sequence of 3 or more amino acids 
wherein the amino acids are naturally occurring or synthetic (non-naturally occurring) 
amino acids. Peptide mimetics include peptides having one or more of the following 
modifications: 

1. peptides wherein one or more of the peptidyl -C(O)NR- linkages
(bonds) have been replaced by a non-peptidyl linkage such as a -CH2-carbamate linkage
(-CH2OC(O)NR-), a phosphonate linkage, a -CH2-sulfonamide (-CH2-S(O)2NR-)
linkage, a urea (-NHC(O)NH-) linkage, a -CH2-secondary amine linkage, or with an
alkylated peptidyl linkage (-C(O)NR-) wherein R is C1-C4 alkyl;

2. peptides wherein the N-terminus is derivatized to a -NRR1 group, to a
-NRC(O)R group, to a -NRC(O)OR group, to a -NRS(O)2R group, to a
-NHC(O)NHR group where R and R1 are hydrogen or C1-C4 alkyl with the proviso that
R and R1 are not both hydrogen;

3. peptides wherein the C terminus is derivatized to -C(O)R2 where R2 is
selected from the group consisting of C1-C4 alkoxy, and -NR3R4 where R3 and R4 are
independently selected from the group consisting of hydrogen and C1-C4 alkyl.

Naturally occurring amino acid residues in peptides are abbreviated as 
recommended by the IUPAC-IUB Biochemical Nomenclature Commission as follows:
Phenylalanine is Phe or F; Leucine is Leu or L; Isoleucine is Ile or I; Methionine is Met
or M; Norleucine is Nle; Valine is Val or V; Serine is Ser or S; Proline is Pro or P;
Threonine is Thr or T; Alanine is Ala or A; Tyrosine is Tyr or Y; Histidine is His or H;
Glutamine is Gln or Q; Asparagine is Asn or N; Lysine is Lys or K; Aspartic Acid is Asp
or D; Glutamic Acid is Glu or E; Cysteine is Cys or C; Tryptophan is Trp or W; Arginine
is Arg or R; Glycine is Gly or G; and Xaa or X is any amino acid. Other naturally
occurring amino acids include, by way of example, 4-hydroxyproline, 5-hydroxylysine
and the like.
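
The abbreviations listed above can be held in a lookup table for converting between one-letter and three-letter notation. This is a minimal sketch with hypothetical names; norleucine (Nle) is omitted because the list gives it no one-letter code.

```python
# Three-letter to one-letter codes for the 20 genetically encoded amino acids.
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Ile": "I", "Met": "M", "Val": "V",
    "Ser": "S", "Pro": "P", "Thr": "T", "Ala": "A", "Tyr": "Y",
    "His": "H", "Gln": "Q", "Asn": "N", "Lys": "K", "Asp": "D",
    "Glu": "E", "Cys": "C", "Trp": "W", "Arg": "R", "Gly": "G",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}
ONE_TO_THREE["X"] = "Xaa"  # X stands for any amino acid


def to_three_letter(peptide: str) -> str:
    """Spell out a one-letter peptide sequence in three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in peptide.upper())


print(to_three_letter("RLSNRLLLR"))  # Arg-Leu-Ser-Asn-Arg-Leu-Leu-Leu-Arg
```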

Synthetic or non-naturally occurring amino acids refer to amino acids 
which do not naturally occur in vivo but which, nevertheless, can be incorporated into the
peptide structures described herein. The resulting "synthetic peptide" contains amino
acids other than the 20 naturally occurring, genetically encoded amino acids at one, two,
or more positions of the peptides. For instance, naphthylalanine can be substituted for
tryptophan to facilitate synthesis. Other synthetic amino acids that can be substituted into
peptides include L-hydroxypropyl, L-3,4-dihydroxyphenylalanyl, alpha-amino acids such
as L-alpha-hydroxylysyl and D-alpha-methylalanyl, L-alpha-methylalanyl, beta-amino
acids, and isoquinolyl. D amino acids and non-naturally occurring synthetic amino acids
can also be incorporated into the peptides. Other derivatives include replacement of the
naturally occurring side chains of the 20 genetically encoded amino acids (or any L or D
amino acid) with other side chains.

As used herein, the term "conservative amino acid substitution" is defined
as an amino acid exchange within one of the following five groups:
I. Small aliphatic, nonpolar or slightly polar residues:
Ala, Ser, Thr, Pro, Gly;
II. Polar, negatively charged residues and their amides:
Asp, Asn, Glu, Gln;
III. Polar, positively charged residues:
His, Arg, Lys;
IV. Large, aliphatic, nonpolar residues:
Met, Leu, Ile, Val, Cys
V. Large, aromatic residues:
Phe, Tyr, Trp
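
The five groups above amount to a membership test: an exchange is conservative when both residues fall in the same group. A minimal sketch follows; the names `GROUPS`, `group_of`, and `is_conservative` are hypothetical, and treating an unchanged residue as "not a substitution" is a choice made for this example.

```python
# The five conservative-substitution groups defined in the text.
GROUPS = {
    "I": {"Ala", "Ser", "Thr", "Pro", "Gly"},
    "II": {"Asp", "Asn", "Glu", "Gln"},
    "III": {"His", "Arg", "Lys"},
    "IV": {"Met", "Leu", "Ile", "Val", "Cys"},
    "V": {"Phe", "Tyr", "Trp"},
}


def group_of(residue: str):
    """Return the group label containing the residue, or None if unlisted."""
    for label, members in GROUPS.items():
        if residue in members:
            return label
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True when two different residues belong to the same group."""
    if original == replacement:
        return False  # no exchange took place
    group = group_of(original)
    return group is not None and group == group_of(replacement)


print(is_conservative("Leu", "Ile"))  # True, both in group IV
print(is_conservative("Arg", "Asp"))  # False, groups III and II
```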



As used herein, the term "antibody" refers to a polyclonal or monoclonal
antibody or a binding fragment thereof such as Fab, F(ab')2 and Fv fragments.

As used herein, the term "TAG polypeptide" refers to an amino acid
sequence that comprises a sequence selected from the group consisting of SEQ ID NO: 6,
SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ
ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24.

As used herein, the term "TAG antibody" refers to an antibody that
specifically binds to an amino acid sequence selected from the group consisting of SEQ
ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO:
11, SEQ ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24.

As used herein, the term "biologically active fragments" or "bioactive
fragment" of a TAG polypeptide encompasses natural or synthetic portions of the
full-length protein that are capable of specific binding to their natural ligand. As used
herein, the terms "portion," or "fragment," when used in relation to polypeptides, refer to
a continuous sequence of residues, such as amino acid residues, which sequence forms a
subset of a larger sequence. Such fragment will necessarily consist of an amino acid
sequence that is identical to a sequence present in the larger parent sequence.

As used herein, the term "treating" includes alleviating the symptoms
associated with a specific disorder or condition and/or preventing or eliminating said
symptoms. For example, treating cancer includes preventing or slowing the growth
and/or division of cancer cells as well as killing cancer cells.

Embodiments

The present invention is directed to a newly discovered gene family
designated TAG (Tumor AntiGen), that encodes multiple isoforms that give rise to
cancer antigens. The TAG gene family has at least five family members: TAG-1 (SEQ
ID NO: 1); TAG-2a (SEQ ID NO: 2); TAG-2b (SEQ ID NO: 3); TAG-2c (SEQ ID NO:
4); and TAG-3 (SEQ ID NO: 5). Through the usage of non-standard initiation codons,
the TAG-1 gene gives rise to three proteins, TAG-1α (SEQ ID NO: 6), TAG-1β (SEQ ID
NO: 7), and TAG-1γ (SEQ ID NO: 8). The TAG-2a, TAG-2b, and TAG-2c genes are all
predicted to encode the same protein sequence, thus there is only a single TAG-2α,
TAG-2β, and TAG-2γ representing the products of the three genes. Accordingly, the TAG-2a,
TAG-2b, and TAG-2c members of the gene family give rise to three proteins: TAG-2α
(SEQ ID NO: 9), TAG-2β (SEQ ID NO: 10), and TAG-2γ (SEQ ID NO: 11). TAG-1,
TAG-2a, TAG-2b, and TAG-2c are characterized herein as genes that give rise to
cancer/testis antigens. Furthermore, TAG-1, TAG-2a, TAG-2b, and TAG-2c all code for
one or more proteins that can give rise to the antigenic peptide RLSNRLLLR (SEQ ID
NO: 12). The RLSNRLLLR (SEQ ID NO: 12) peptide binds to the class I MHC
molecule, HLA-A3, and is recognized by melanoma-specific CTL. The TAG-3 member
of the gene family also encodes three proteins (SEQ ID NO: 22, SEQ ID NO: 23 and
SEQ ID NO: 24) that, while not giving rise to the antigenic peptide RLSNRLLLR (SEQ
ID NO: 12), are expected to generate other cancer/testis antigens.

In accordance with one embodiment of the present invention a purified
polypeptide is provided comprising the amino acid sequence of SEQ ID NO: 12. In
another embodiment a purified polypeptide is provided comprising the amino acid
sequence of SEQ ID NO: 25. In one embodiment the polypeptide comprises the amino
acid sequence of SEQ ID NO: 6-11, or an amino acid sequence that differs from any of
those sequences by one or more conservative amino acid substitutions. In another
embodiment the purified polypeptide comprises an amino acid sequence that differs from
SEQ ID NO: 6-11 by fewer than 5 conservative amino acid substitutions, and in a further
embodiment, by 2 or fewer conservative amino acid substitutions. In one embodiment the
present invention is directed to a purified polypeptide that comprises an amino acid
sequence selected from the group consisting of

XLSNRLLLR (SEQ ID NO: 13), wherein X is His, Arg or Lys;
RXSNRLLLR (SEQ ID NO: 14), wherein X is Met, Leu, Ile or Val;
RLXNRLLLR (SEQ ID NO: 15), wherein X is Ala, Ser, Thr, Pro or Gly;
RLSXRLLLR (SEQ ID NO: 16), wherein X is Asp, Asn, Glu or Gln;
RLSNXLLLR (SEQ ID NO: 17), wherein X is His, Arg or Lys;
RLSNRXLLR (SEQ ID NO: 18), wherein X is Met, Leu, Ile or Val;
RLSNRLXLR (SEQ ID NO: 19), wherein X is Met, Leu, Ile or Val;
RLSNRLLXR (SEQ ID NO: 20), wherein X is Met, Leu, Ile or Val; and
RLSNRLLLX (SEQ ID NO: 21), wherein X is His, Arg or Lys.

In accordance with one embodiment of the present invention a purified polypeptide is
provided that consists of the amino acid sequence of SEQ ID NO: 6-11, or a bioactive
fragment of SEQ ID NO: 6-11, or an amino acid sequence that differs from SEQ ID NO:
6-11 by one to ten conservative amino acid substitutions.
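
The single-position variants defined by SEQ ID NO: 13 through SEQ ID NO: 21 can be enumerated mechanically. The sketch below uses the residue choices exactly as listed for each position (in one-letter code) and skips the choice that merely reproduces SEQ ID NO: 12; the names `BASE`, `ALLOWED`, and `single_substitution_variants` are hypothetical.

```python
BASE = "RLSNRLLLR"  # SEQ ID NO: 12

# Residues permitted for X at positions 1 to 9, as listed for
# SEQ ID NO: 13 through SEQ ID NO: 21 (one-letter codes).
ALLOWED = ["HRK", "MLIV", "ASTPG", "DNEQ", "HRK", "MLIV", "MLIV", "MLIV", "HRK"]


def single_substitution_variants(base: str = BASE, allowed=ALLOWED) -> list[str]:
    """List every peptide that differs from base at exactly one position,
    using only the residues permitted at that position."""
    variants = []
    for pos, choices in enumerate(allowed):
        for residue in choices:
            if residue != base[pos]:  # skip the unchanged peptide
                variants.append(base[:pos] + residue + base[pos + 1:])
    return variants


variants = single_substitution_variants()
print(len(variants))  # 25
print(variants[:3])   # ['HLSNRLLLR', 'KLSNRLLLR', 'RMSNRLLLR']
```

Each listed formula also covers the unsubstituted peptide, since the residue found in SEQ ID NO: 12 is always among the permitted choices for X.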

The polypeptides of the present invention may include additional amino 
acid sequences to assist in the stabilization and/or purification of recombinantly produced
polypeptides. These additional sequences may include intra- or inter-cellular targeting
peptides or various peptide tags known to those skilled in the art. In one embodiment, the
purified polypeptide comprises an amino acid sequence selected from the group
consisting of SEQ ID NO: 6-12 and a peptide tag, wherein the peptide tag is linked to the
TAG peptide sequence. Suitable expression vectors for expressing such fusion proteins
and suitable peptide tags are known to those skilled in the art and commercially available.
In one embodiment the tag comprises a His tag.

In another embodiment, the present invention is directed to a purified
polypeptide that comprises an amino acid fragment of a TAG polypeptide. More
particularly, the TAG polypeptide fragment consists of natural or synthetic portions of a
full-length polypeptide selected from the group consisting of SEQ ID NO: 6-11 that are
capable of specific binding to their natural ligand. Alternatively, the fragment may
comprise an antigenic fragment, including fragments of 10-30, 12-19, 8-12 or 9 amino
acids in length, of a polypeptide selected from the group consisting of SEQ ID NO: 6-11.
In one embodiment the antigenic peptide fragment consists of SEQ ID NO: 12.
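
Candidate fragments of a given length can be enumerated with a sliding window over the parent sequence. The sketch below is illustrative only: the function name is hypothetical, and the sequence `toy_parent` is an invented placeholder that embeds SEQ ID NO: 12; it is not any of SEQ ID NO: 6-11. Enumeration by length does not by itself show that a fragment is antigenic.

```python
def fragments(sequence: str, min_len: int, max_len: int):
    """Yield (start, fragment) for every contiguous fragment whose length
    lies between min_len and max_len inclusive; start is 1-based."""
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield start + 1, sequence[start:start + length]


# Invented placeholder sequence, NOT one of SEQ ID NO: 6-11.
toy_parent = "MAGTRLSNRLLLRKDE"

nine_mers = list(fragments(toy_parent, 9, 9))
print(len(nine_mers))                   # 8
print((5, "RLSNRLLLR") in nine_mers)    # True
print(sum(1 for _ in fragments(toy_parent, 8, 12)))  # 35
```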

The present invention also encompasses nucleic acid sequences that 
encode the TAG polypeptides. In one embodiment a nucleic acid sequence is provided 
comprising the sequence of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO:
4, SEQ ID NO: 5 or fragments thereof. All or part of the TAG-1, TAG-2a, TAG-2b, and
TAG-2c genes may be used as probes for the detection of the TAG genes in biological
samples taken from individuals with cancer or suspected of having cancer. Alternatively,
oligonucleotide pairs based on the sequence of the TAG genes may be used as primers
for the detection of the TAG genes in biological samples taken from individuals with
cancer or suspected of having cancer.

The present invention is also directed to recombinant human TAG gene 
constructs. In one embodiment, the recombinant gene construct comprises a non-native 
promoter operably linked to the amino acid coding region of a nucleic acid sequence 
selected from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4,
SEQ ID NO: 5, or fragments thereof. The non-native promoter is preferably a strong
constitutive promoter that allows for expression in a predetermined host cell. These
recombinant gene constructs can be introduced into host cells to produce transgenic cell
lines that synthesize the TAG gene products. Host cells can be selected from a wide
variety of eukaryotic and prokaryotic organisms, and two preferred host cells are E. coli
and yeast cells. In one embodiment the host cell is a human antigen presenting cell.

In accordance with one embodiment, a nucleic acid sequence selected 
from the group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 4, and SEQ ID
NO: 5 is inserted into a eukaryotic or prokaryotic expression vector in a manner that
operably links the gene sequences to the appropriate regulatory sequences, and a TAG 
polypeptide is expressed in a eukaryotic or prokaryotic host cell. Suitable eukaryotic host 
cells and vectors are known to those skilled in the art. The baculovirus system is also 
suitable for producing transgenic cells and synthesizing the TAG genes of the present 
invention. One aspect of the present invention is directed to transgenic cell lines that 
contain recombinant genes that express TAG polypeptides and fragments of the TAG 
coding sequence. As used herein a transgenic cell is any cell that comprises an 
exogenously introduced nucleic acid sequence. 

In one embodiment the introduced nucleic acid is sufficiently stable in the 
transgenic cell (i.e. incorporated into the cell's genome, or present in a high copy 
plasmid) to be passed on to progeny cells. The cells can be propagated in vitro using 
standard cell culture procedures, or in an alternative embodiment, the host cells are
eukaryotic cells and are propagated as part of a plant or an animal, including for example,
a transgenic animal. In one embodiment the transgenic cell is a human cell and
comprises a nucleic acid sequence selected from the group consisting of SEQ ID NO: 1-5.
The present invention also includes non-human transgenic organisms wherein one or
more of the cells of the transgenic organism comprise a recombinant gene that expresses 
a TAG polypeptide. 

In accordance with one embodiment a composition is provided for 
inducing an immune response against the TAG genes, proteins, and peptides described 
herein. In one embodiment the composition comprises a purified peptide that consists of 
the amino acid sequence of SEQ ID NO: 12 or SEQ ID NO: 25. In another embodiment 
the peptide consists of a sequence selected from the group consisting of SEQ ID NO:
6-11, and antigenic fragments of those sequences. Alternatively, the composition for
inducing an immune response may comprise a nucleic acid sequence selected from the
group consisting of SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, and SEQ
ID NO: 5. The compositions can be combined with a pharmaceutically acceptable carrier
or adjuvant and administered to a mammalian species to induce an immune response.
The immune response can take the form of an antibody response, a T helper response, or
a CTL response. The immune response may be generated in vitro or in vivo.

In accordance with one embodiment, the TAG proteins or TAG-derived
peptides can be used to immunize a non-human recipient such as a mouse, rat, or goat for 
the production of antibodies that specifically recognize the TAG proteins and peptides. 
Antibodies to TAG polypeptieds may be generated using methods that are well known in 
the art. In one embodiment, recombinantly produced TAG polypeptides, or fragments 
thereof are used to generate antibodies against the TAG polypeptides. The recombinantly 
produced TAG proteins can also be used to obtain crystal structures. Such structures 
would allow crystallographic analysis that could guide the design of specific drugs 
to inhibit TAG function. 

In accordance with one embodiment an antibody is provided that binds to 
a polypeptide selected from the group consisting of SEQ ID NO: 6, SEQ ID NO: 7, SEQ 
ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, and SEQ ID NO: 11. In accordance with one 
embodiment an antibody is provided that specifically binds to all six TAG polypeptides 
(i.e., SEQ ID NOs: 6-11). Alternatively, a composition may be provided that comprises 
one or more antibodies specific for one or two of the individual TAG polypeptides. 
Alternatively, in one embodiment an antibody is provided that specifically binds to the 
peptide sequence of SEQ ID NO: 12. In another embodiment an antibody is provided 
that specifically binds to the peptide sequence of SEQ ID NO: 25. In one embodiment 
the antibody is a monoclonal antibody. The antibodies may be used with or without 
modification, and may be labeled by joining them, either covalently or non-covalently, 
with a reporter molecule. In addition, the antibodies can be formulated with standard 
carriers and optionally labeled to prepare therapeutic or diagnostic compositions. 

Antibodies to TAG polypeptides or peptide fragments thereof may be 
generated using methods that are well known in the art. For the production of antibodies, 
various host animals, including rabbits, mice, rats, goats and other mammals, can be 
immunized by injection with a TAG polypeptide (i.e., the TAG-1α, TAG-1β, TAG-1γ, 
TAG-2α, TAG-2β, and TAG-2γ proteins), or with smaller peptides derived from those proteins. 
The whole proteins can either be synthesized or the corresponding genes can be inserted 
in an expression vector and the expressed proteins purified. Methods for expressing 
genes in expression vectors are well known in the art (Sambrook and Russell, 2001c). 
Small peptides corresponding to short amino acid sequences within the whole proteins 
can be synthesized and purified. When small peptides are used as the immunogen, they 
may be conjugated to carrier proteins such as KLH or tetanus toxoid. Various adjuvants 
may be used to increase the immunological response, depending on the host species, 
including but not limited to Freund's (complete and incomplete), mineral gels such as 
aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, 
polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and 
potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and 
Corynebacterium parvum. Methods of immunization to achieve a polyclonal antibody 
response are well known in the art, as are the methods for generating hybridomas and 
monoclonal antibodies. 

For preparation of monoclonal antibodies, any technique which provides 
for the production of antibody molecules by continuous cell lines in culture may be used. 
Examples include the hybridoma technique originally developed by Kohler and Milstein 
(1975, Nature 256:495-497), as well as the trioma technique, the human B-cell 
hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the 
EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In an 
additional embodiment of the invention, monoclonal antibodies can be produced in 
germ-free animals utilizing recent technology (PCT/US90/02545). According to the invention, 
human antibodies may be used and can be obtained by using human hybridomas (Cote et 
al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells 
with EBV in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer 
Therapy, Alan R. Liss, Inc., pp. 77-96). In fact, according to the invention, techniques 
developed for the production of "chimeric antibodies" (Morrison et al., 1984, Proc. Natl. 
Acad. Sci. U.S.A. 81:6851-6855; Neuberger et al., 1984, Nature 312:604-608; Takeda et 
al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule 
specific for epitopes of TAG polypeptides together with genes from a human antibody 
molecule of appropriate biological activity can be used; such antibodies are within the 
scope of this invention. 

According to the invention, techniques described for the production of 
single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce TAG protein- 
specific single chain antibodies. An additional embodiment of the invention utilizes the 
techniques described for the construction of Fab expression libraries (Huse et al., 1989, 
Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab 
fragments with the desired specificity for TAG proteins, derivatives, or analogs. In 
one embodiment the single chain antibody specifically binds to the amino acid sequence 
of SEQ ID NO: 12. 

Antibody fragments which contain the idiotype of the molecule can be 
generated by known techniques. For example, such fragments include but are not limited 
to: the F(ab')2 fragment, which can be produced by pepsin digestion of the antibody 
molecule; the Fab' fragments, which can be generated by reducing the disulfide bridges of 
the F(ab')2 fragment; the Fab fragments, which can be generated by treating the antibody 
molecule with papain and a reducing agent; and Fv fragments. 

Antibodies generated in accordance with the present invention may 
include, but are not limited to, polyclonal, monoclonal, chimeric (i.e., "humanized" 
antibodies), single chain (recombinant), Fab fragments, and fragments produced by a Fab 
expression library. These antibodies can be used as diagnostic agents for the diagnosis of 
conditions or diseases characterized by expression or overexpression of TAG 
polypeptides (such as cancer), or in assays to monitor a patient's responsiveness to an 
anti-cancer therapy. In one embodiment antibodies specific for one or more of the TAG 
polypeptides are used as diagnostics for the detection of the TAG protein in cancer cells. 

The antibodies or antibody fragments of the present invention can be 
combined with a carrier or diluent to form a composition. In one embodiment, the carrier 
is a pharmaceutically acceptable carrier. Such carriers and diluents include sterile liquids 
such as water and oils, with or without the addition of a surfactant and other 
pharmaceutically and physiologically acceptable carriers, including adjuvants, excipients, 
or stabilizers. Illustrative oils are those of petroleum, animal, vegetable, or synthetic 
origin, for example, peanut oil, soybean oil, or mineral oil. In general, water, saline, 
aqueous dextrose and related sugar solutions, and glycols, such as propylene glycol or 
polyethylene glycol, are preferred liquid carriers, particularly for injectable solutions. 

In accordance with one embodiment the detection of TAG nucleic acid 
sequences or polypeptides is used as a diagnostic marker for detecting cancer. More 
particularly, in one embodiment the detection of TAG mRNA or TAG polypeptides or 
peptides is diagnostic for cancer. In another embodiment the TAG genes, the TAG 
proteins, or the TAG-derived peptides can be used to immunize an individual to induce 
an immune response. The induced response may include T helper cells or CTL specific 
for the TAG-derived peptides. The induced immune response may be useful in 
preventing the development of cancer in an individual without cancer, and it may be 
useful in eliminating or preventing the further spread of the disease in an individual with 
cancer. In one embodiment the TAG genes are placed in an expression vector and 
expressed in an antigen presenting cell. Alternatively, the TAG proteins or TAG-derived 
peptides may be added to antigen presenting cells. In either case, the antigen presenting 
cells will now present TAG-derived peptides which can be used to stimulate an in vitro T 
helper cell or CTL response. The T helper cells or CTL can then be used as diagnostics 
to detect the expression of the TAG genes in tumor or other cells. The T helper cells or 
CTL can also be infused into a cancer patient as a treatment for cancer. 

Accordingly, one embodiment of the invention is directed to the use of 
TAG polypeptides, peptides and nucleic acids as diagnostic markers for neoplastic 
disease such as cancer. The method comprises the steps of screening for elevated levels 
or inappropriate expression of TAGs, including the expression of TAGs in somatic 
tissues. The term "inappropriate expression" includes any non-typical expression that is 
deleterious to the cell or host organism, including for example, expression in a cell type 
that normally does not express the gene product, or expression of a modified form of the 
protein that impacts cell function. Such screens could be conducted using antibodies 
specific for the TAG polypeptides. Alternatively, antibodies directed against TAG 
polypeptides can be used in assays to monitor the effectiveness of anticancer therapies 
in patients being treated. 

All or part of the TAG-1, TAG-2a, TAG-2b, and TAG-2c genes may be 
used as probes for the detection and quantification of the corresponding genes in 
biological samples isolated from an individual with cancer or suspected of having cancer. 
For example, both Northern hybridization and dot blot hybridization may be used to 
detect and quantify the TAG genes. Methods for such procedures are well known in the 
art (Sambrook and Russell, 2001a). Combinations of oligonucleotide pairs based on the 
sequence of the TAG genes may be used as PCR primers to detect the TAG gene mRNA 
in biological samples by using the reverse transcriptase polymerase chain reaction 
(RT-PCR). Specific primer pairs are illustrated in the Examples below, but other pairs can 
easily be identified by those schooled in the art. Methods for RT-PCR are well known in 
the art (Sambrook and Russell, 2001b). Because the TAG genes have been shown to be 
expressed only in cancerous tissue, placenta, and testis, their detection in biological 
samples other than placenta and testis would indicate the presence of cancer in an 
individual for which a diagnosis had not yet been made. Alterations in the level of 
mRNA relative to a control RNA sample would be useful in monitoring the prognosis of 
the disease in an individual known to have cancer, and in monitoring the results of 
immunotherapy directed against cancer cells expressing the TAG genes. 
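
The comparison of TAG mRNA levels against a control RNA can be illustrated with a minimal Python sketch. The function names and the use of a GAPDH-type control signal as the normalizer are assumptions made for illustration only; the specification states that such comparisons would be useful but does not prescribe a particular calculation or threshold.

```python
# Illustrative sketch only: names and inputs are hypothetical, not from the specification.

# Normal tissues in which the text says TAG expression is expected.
EXPECTED_TISSUES = {"placenta", "testis"}


def normalized_expression(tag_signal: float, control_signal: float) -> float:
    """Return the TAG signal relative to a control RNA signal (e.g. GAPDH)."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return tag_signal / control_signal


def fold_change(current: float, baseline: float) -> float:
    """Return the ratio of two normalized expression values."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return current / baseline


def detection_suggests_cancer(tissue: str, tag_detected: bool) -> bool:
    """Treat TAG mRNA detected outside placenta and testis as a positive flag."""
    return tag_detected and tissue.lower() not in EXPECTED_TISSUES
```

In such a scheme, a fold change below 1 between a later and an earlier sample from the same individual would correspond to a falling TAG mRNA level relative to the control; how such a change should be interpreted clinically is outside the scope of this sketch.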

The tumor antigens of the present invention encompass the proteins that 
can be expressed from the TAG-1, TAG-2a, TAG-2b, and TAG-2c genes. These proteins 
include TAG-1α, TAG-1β, TAG-1γ, TAG-2α, TAG-2β, and TAG-2γ. In accordance 
with one embodiment the tumor antigens of the present invention encompass small 
peptides, typically nine amino acids in length, but generally no less than eight and no 
more than twenty amino acids in length, that are derived from the TAG-1α, TAG-1β, 
TAG-1γ, TAG-2α, TAG-2β, and TAG-2γ proteins. Further, because it has been shown 
that antigens can be derived by the non-traditional translation of genes (Mayrand and 
Green, 1998; Shastri et al., 2002), the tumor antigens of the present invention encompass 
any peptide that can be expressed from the TAG-1, TAG-2a, TAG-2b, and TAG-2c 
genes, whether by traditional or non-traditional translation. 

The TAG-1, TAG-2a, TAG-2b, and TAG-2c genes are known to be 
expressed in melanoma, myelogenous leukemia, lung cancer, breast cancer, ovarian 
cancer, colon cancer, gastric cancer, and prostate cancer and thus may be used as 
immunogens to prevent, eliminate, or delay the progression of those cancers. These same 
genes may also be expressed in untested forms of cancer and thus may be useful in their 
ability to prevent, eliminate or delay the progression of additional cancers. 

Antibodies generated with specificity for the TAG-1α, TAG-1β, TAG-1γ, 
TAG-2α, TAG-2β, and TAG-2γ proteins are used in accordance with one embodiment to 
detect the corresponding proteins in biological samples. The biological sample could 
come from an individual who is suspected of having cancer and thus detection would 
serve to diagnose the cancer. Alternatively, the biological sample may come from an 
individual known to have cancer, and detection of the TAG proteins would serve as an 
indicator of disease prognosis or treatment efficacy. Appropriate immunoassays are well 
known in the art and include, but are not limited to, immunohistochemistry, flow 
cytometry, radioimmunoassay, western blotting, and ELISA. Biological samples suitable 
for such testing would include, but are not limited to, cells, tissue biopsy specimens, 
whole blood, plasma, serum, sputum, cerebrospinal fluid, pleural fluid, and urine. 

Antigens recognized by T cells, whether helper T lymphocytes or CTL, 
are not recognized as intact proteins, but rather as small peptides that associate with class 
I or class II MHC proteins on the surface of cells. During the course of a naturally 
occurring immune response, antigens that are recognized in association with class II MHC 
molecules on antigen presenting cells are acquired from outside the cell, internalized, and 
processed into small peptides that associate with the class II MHC molecules. 
Conversely, the antigens that give rise to peptides that are recognized in association with 
class I MHC molecules are generally proteins made within the cells, and these antigens 
are processed and associate with class I MHC molecules. It is now well known that the 
peptides that associate with a given class I or class II MHC molecule are characterized as 
having a common binding motif, and the binding motifs for a large number of different 
class I and II MHC molecules have been determined. It is also well known that synthetic 
peptides can be made which correspond to the sequence of a given antigen and which 
contain the binding motif for a given class I or II MHC molecule. These peptides can 
then be added to appropriate antigen presenting cells, either in vitro or in vivo, and be 
used to stimulate a T helper cell or CTL response. The binding motifs, methods for 
synthesizing the peptides, and methods for stimulating a T helper cell or CTL response 
are all well known and readily available. 

Thus, antigens of this invention may take several forms. The TAG-1, 
TAG-2a, TAG-2b, and TAG-2c genes may be used alone, in combination with one 
another, or in combination with the genes for other antigens. The genes would be cloned 
into a vector and operationally linked to a promoter. Vectors may be chosen such that the 
genes would be expressed in bacteria or insect cells with the purpose of purifying the 
TAG-1α, TAG-1β, TAG-1γ, TAG-2α, TAG-2β, and TAG-2γ proteins. The vector may 
be a mammalian expression vector system with the recipient cells being dendritic cells or 
cultured mammalian cell lines. Transient or stable transfection of these cells with the 
gene of interest would provide cells which can then be used either in vitro or in vivo to 
stimulate a T helper cell or CTL immune response to the antigens of this invention. 
Alternatively, the vector may include all or part of a viral or bacterial genome, for 
example vaccinia virus, fowlpox virus, adenovirus, or BCG. Dendritic cells or cultured 
mammalian cell lines can be infected in vitro to provide antigenic cells for the stimulation 
of T helper cell or CTL responses. The viral or bacterial vectors expressing the genes of 
interest could also be used to immunize an individual with the intent of stimulating an 
immune response to the antigens of this invention. The vectors, methods of cloning, and 
methods of stimulating an immune response to the expressed genes are all well known in 
the art. 

The antigens of this invention may also take the form of the whole 
proteins TAG-1α, TAG-1β, TAG-1γ, TAG-2α, TAG-2β, and TAG-2γ. The whole 
proteins may be added to autologous dendritic cells and used to stimulate a T helper cell 
or CTL response in vitro. The in vitro generated T helper cells or CTL can then be 
infused into a patient with cancer (Yee et al., 2002), and specifically a patient with a form 
of cancer that expresses one or more of the TAG-1, TAG-2a, TAG-2b, and TAG-2c 
genes. The TAG-1α, TAG-1β, TAG-1γ, TAG-2α, TAG-2β, and TAG-2γ proteins may 
also be used to vaccinate an individual. The proteins may be injected alone, but most 
often they would be administered in combination with an adjuvant. The proteins may 
also be added to dendritic cells in vitro, with the dendritic cells being subsequently 
transferred into an individual with cancer with the intent of stimulating an immune 
response. 

The antigens of this invention may also take the form of small peptides. 
Peptides that bind to class I MHC molecules and that stimulate a CTL response are 
commonly nine amino acids in length, but may be as short as eight amino acids in length, 
and as long as fourteen amino acids in length. The peptides which bind to a particular 
class I MHC molecule share a common binding motif in which particular amino acid 
residues within the sequence generally have a very restricted allowable number of amino 
acids which can occupy that position, while amino acids at the remaining positions are 
largely without restriction. Due to the nature of the peptide binding site on class II MHC 
molecules, class II MHC binding peptides can be as short as ten amino acids, and may be 
as long as thirty amino acids in length. Like class I MHC binding peptides, class II MHC 
binding peptides have binding motifs for particular class II MHC molecules. Because of 
the extended nature of the class II MHC binding peptides relative to class I MHC binding 
peptides, a class II antigenic peptide may encompass many overlapping sequences with a 
common core sequence. An extensive literature exists describing the motifs of the 
peptides that bind to the various class I and class II MHC molecules. 

Prior to identifying the TAG-1, TAG-2a, TAG-2b, and TAG-2c gene 
sequences, the RLSNRLLLR peptide (SEQ ID NO: 12) was identified as an HLA-A3 
binding peptide that was recognized by melanoma reactive CTL. This peptide is derived 
from proteins expressed from the antigenic genes of this invention, but as described in the 
Examples, had to be identified independently of any knowledge of the gene or proteins 
from which it was derived. Now that the TAG-1, TAG-2a, TAG-2b, and TAG-2c gene 
sequences have been identified, it is possible to predict many of the antigenic peptides 
from the encoded proteins. Predicted peptide antigens can be synthesized and readily tested 
in vitro for their ability to stimulate a T helper cell or CTL response. The binding motifs, 
methods of peptide synthesis, and methods of in vitro stimulation and testing are all well 
known and readily available to the skilled practitioner. 
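
Predicting candidate peptides from a protein sequence can be sketched in Python as a sliding-window enumeration followed by an anchor-residue check. This is a sketch under stated assumptions: the protein string below is a made-up placeholder that merely embeds the SEQ ID NO: 12 peptide (it is not a TAG sequence), and the anchor sets are example values chosen for illustration rather than a motif taken from the specification.

```python
# Illustrative sketch: the protein string and the anchor sets are hypothetical placeholders.
from typing import Iterator, Set, Tuple


def candidate_peptides(protein: str, min_len: int = 8,
                       max_len: int = 14) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start, peptide) for every window of the allowed lengths."""
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield start + 1, protein[start:start + length]


def matches_motif(peptide: str, p2: Set[str], c_term: Set[str]) -> bool:
    """Check anchor residues at position 2 and at the C-terminus."""
    return peptide[1] in p2 and peptide[-1] in c_term


# Placeholder sequence that merely embeds the SEQ ID NO: 12 peptide; not a TAG protein.
protein = "MAGT" + "RLSNRLLLR" + "KEDS"
# Assumed example anchors; a published motif table should be consulted for real work.
P2_ANCHORS = {"L", "V", "M"}
C_ANCHORS = {"K", "R", "Y"}

# Keep only the nine-residue windows whose anchors match the assumed motif.
nonamers = [(pos, pep) for pos, pep in candidate_peptides(protein, 9, 9)
            if matches_motif(pep, P2_ANCHORS, C_ANCHORS)]
```

With these placeholder inputs the filter retains only RLSNRLLLR, starting at position 5. Candidates selected this way would still need to be synthesized and tested in vitro, as the text describes.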

It is also well-known in the art that the naturally occurring sequence of the 
antigenic peptides is not always optimal for stimulating an immune response. Peptide 
analogs can readily be synthesized that retain their ability to stimulate a particular 
immune response, but which also gain several beneficial features which include, but are 
not limited to, the following: (i) Substitutions may be made in the peptide at residues 
known to interact with the MHC molecule. Such substitutions can have the effect of 
increasing the binding affinity of the peptide for the MHC molecule and can also increase 
the lifespan of the peptide-MHC complex, the consequence of which is that the analog is 
a more potent stimulator of an immune response than is the original peptide. (ii) The 
substitutions may be at positions in the peptide that interact with the receptor on the T 
helper cells or CTL, and have the effect of increasing the affinity of interaction such that 
a stronger immune response is generated. (iii) Additionally, the substitutions may have 
no effect on the immunogenicity of the peptide per se, but rather may prolong its 
biological half-life or prevent it from undergoing spontaneous substitutions or 
alterations which might otherwise negatively impact the immunogenicity of the 
peptide. 

The antigens of this invention can also be used as a vaccine for cancer, 
and more specifically for melanoma and myelogenous leukemia. As described above, the 
antigens may take the form of genes, proteins, or peptides. The vaccine may include only 
the antigens of this invention or it may include other cancer antigens that have been 
identified. Pharmaceutical carriers, diluents and excipients are generally added that are 
compatible with the active ingredients and acceptable for pharmaceutical use. Examples 
of such carriers include, but are not limited to, water, saline solutions, dextrose, or 
glycerol. Combinations of carriers may also be used. The vaccine compositions may 
further incorporate additional substances to stabilize pH, or to function as adjuvants, 
wetting agents, or emulsifying agents, which can serve to improve the effectiveness of 
the vaccine. 

The composition may be administered parenterally or orally, and, if 
parenterally, either systemically or topically. Parenteral routes include subcutaneous, 
intravenous, intradermal, intramuscular, intraperitoneal, intranasal, transdermal, or buccal 
routes. One or more such routes may be employed. Parenteral administration can be, for 
example, by bolus injection or by gradual perfusion over time. Alternatively, or 
concurrently, administration may be by the oral route. 

It is understood that the suitable dosage of an immunogen of the present 
invention will depend upon the age, sex, health, and weight of the recipient, the kind of 
concurrent treatment, if any, the frequency of treatment, and the nature of the effect 
desired; however, the most preferred dosage can be tailored to the individual subject, as 
determined by the researcher or clinician. The total dose required for any given treatment 
will commonly be determined with respect to a standard reference dose based on the 
experience of the researcher or clinician, such dose being administered either in a single 
treatment or in a series of doses, the success of which will depend on the production of a 
desired immunological result (i.e., successful production of a T helper cell and/or 
CTL-mediated response to the antigen, which response gives rise to the prevention and/or 
treatment desired). Thus, the overall administration schedule must be considered in 
determining the success of a course of treatment and not whether a single dose, given in 
isolation, would or would not produce the desired immunologically therapeutic result or 
effect. Thus, the therapeutically effective amount (i.e., that producing the desired T 
helper cell and/or CTL-mediated response) will depend on the antigenic composition of 
the vaccine used, the nature of the disease condition, the severity of the disease condition, 
the extent of any need to prevent such a condition where it has not already been detected, 
the manner of administration dictated by the situation requiring such administration, the 
weight and state of health of the individual receiving such administration, and the sound 
judgment of the clinician or researcher. Needless to say, the efficacy of administering 
additional doses, and of increasing or decreasing the interval, may be re-evaluated on a 
continuing basis, in view of the recipient's immunocompetence (for example, the level of 
T helper cell and/or CTL activity with respect to tumor-associated or tumor-specific 
antigens). 

The concentration of the T helper or CTL stimulatory peptides or proteins 
of the invention in pharmaceutical formulations is subject to wide variation, including 
anywhere from less than 0.01% by weight to as much as 50% or more. Factors such as 
volume and viscosity of the resulting composition should also be considered. The 
solvents, or diluents, used for such compositions include water, possibly PBS (phosphate 
buffered saline), or saline itself, or other possible carriers or excipients. The immunogens 
of the present invention may also be contained in artificially created structures such as 
liposomes, which structures may or may not contain additional molecules, such as 
proteins or polysaccharides, inserted in the outer membranes of said structures and 
having the effect of targeting the liposomes to particular areas of the body, or to 
particular cells within a given organ or tissue. Such targeting molecules may commonly 
be some type of immunoglobulin. Antibodies may work particularly well for targeting 
the liposomes to tumor cells. 

The present invention is also directed to a vaccine in which a peptide or 
polypeptide or active fragment of the present invention is delivered or administered in the 
form of a polynucleotide coding the peptide or polypeptide or active fragment, whereby 
the peptide or polypeptide or active fragment is produced in vivo. The polynucleotide 
may be included in a suitable expression vector and combined with a pharmaceutically 
acceptable carrier. 

The vaccine compositions may be used prophylactically for the purposes 
of preventing cancer in an individual that does not currently have cancer, or they may be 
used to treat an individual that already has cancer. Prevention relates to a process of 
prophylaxis in which the individual is immunized prior to the induction or onset of 
cancer. For example, individuals with a history of severe sunburn who are at risk for 
developing melanoma might be immunized prior to the onset of the disease. 
Alternatively, individuals that already have cancer can be immunized with the antigens of 
the present invention so as to stimulate an immune response that would be reactive 
against the cancer. A clinically relevant immune response would be one in which the 
cancer completely regresses and is eliminated from the patient, and it would also 
include those responses in which the progression of the cancer is blocked without being 
eliminated. 

In one embodiment, the present invention provides methods of screening 
for agents, small molecules, or proteins that interact with polypeptides comprising a 
sequence selected from the group consisting of SEQ ID NOs: 6-11 or bioactive fragments 
thereof. The invention encompasses both in vivo and in vitro assays to screen small 
molecules, compounds, recombinant proteins, peptides, nucleic acids, antibodies, etc., 
which bind to or modulate the activity of a TAG polypeptide and are thus useful as 
therapeutics or diagnostic markers for cancer. As used herein, modulating the activity of a 
TAG polypeptide includes interfering with or altering the TAG polypeptide's ligand binding 
properties. 

Example 1 Isolation of the TAG Genes 

Cell Lines 

The melanoma lines A375, AVL3-Mel, DM6, DM13, DM14, DM93, 
DM122, DM281, DM319, DM331, DM472, EB81-Mel, HT144, LB373-Mel, Na8-Mel, 
SK-Mel-2, SK-Mel-5, SK-Mel-28, VMM1, VMM5, VMM12, VMM15, VMM17, 
VMM18, VMM19, VMM34, VMM39, VMM39, VMM64, VMM86, VMM105, 
VMM150, VMM273, and VMM330 were maintained in RPMI 1640 supplemented with 
5-10% FBS and 2 mM L-glutamine. K562, a myelogenous leukemia (Lozzio and Lozzio, 
1979, Leuk Res 3, 363-370), and the B-lymphoblastoid cell lines JY, VMM12-EBV, and 
VMM18-EBV were maintained in the same media. C1R-A3 and T2-A3 were 
maintained in the same media supplemented with 200 µg/ml G418. 

The lung tumor lines SK-Mes-1, SK-LU-1, Calu-1, VLU-6, VLU-19, 
VBT-2, and TTB-250; the breast tumor lines MCF-7 (KR), MDA-MB-468, 
MDA-MB-453, TTB-173, VA-B5A, SK-BR-3, BRC-751, and BRC-173; ovarian tumor lines 
CA-OV-14, SK-OV-3, TTB-6, and VAO-12; colon tumor lines CCL-228, CL-188, HT-29, 
VCR-8, SW-48, and VCR-4; brain tumor lines CRL-1690, HTB-12, HTB-14, and 
HTB-17; prostate tumor lines DU145, LNCaP, and PC-3; pharyngeal squamous cell carcinoma 
FaDu; tongue squamous cell carcinoma SCC4; and cervical cell carcinoma SiHa were 
also used and cultured in the same manner as the melanoma tumor cell lines. Four 
cryopreserved prostate carcinoma clinical isolates were also obtained from the Tissue 
Procurement Facility at the University of Virginia. cDNA from a prostate carcinoma was 
purchased from Biochain. 

CTL Line 

VMM18-specific CTL have been described previously (Skipper et al., 
1996, J Immunol 157, 5027-5033). CTL were expanded in bulk culture using anti-CD3 
antibody (Greenberg and Cheever, 1985, Surv Immunol Res 4, 283-296) and 
cryopreserved in aliquots of 1-5 x 10^7 cells for use in epitope reconstitution assays. 

Isolation of HLA-A3 Associated Peptides 

Immunoaffinity purification of class I MHC molecules from aliquots of 
6-8 x 10^10 VMM18 tumor cells was performed as described (Hogan et al., 1998, Cancer 
Res 58, 5144-5150), except that the HLA-A3-specific monoclonal antibody GAP-A3, 
bound to protein A-Sepharose, was used to isolate the HLA-A3 molecules. 

Peptide Fractionation 

Peptide extracts were fractionated by RP-HPLC using an Applied 
Biosystems model 140B system. The extracts were concentrated by vacuum 
centrifugation and injected onto a Higgins (Mountain View, CA) C18 HAISIL column 
(2.1 mm x 4 cm, 300 Å, 5 µm). The peptides were eluted with a gradient of 
acetonitrile/0.085% trifluoroacetic acid (TFA) in 0.1% TFA/water, with the 
concentration of acetonitrile increasing from 0 to 9% (0 to 5 min), 9 to 36% (5 to 55 
min), and 36 to 60% (55 to 62 min). Second dimension fractionations of selected first 
dimension (TFA) fractions were accomplished using the same gradient but with the 
substitution of heptafluorobutyric acid (HFBA) for TFA. A third dimension of RP-HPLC 
was achieved using an Eldex (Napa, CA) MicroPro pump, a homemade C18 
microcapillary column and an Applied Biosystems model 785A UV absorbance detector. 
The column was made by packing a 27-cm bed of 10-µm C18 particles in a section of 
285 µm o.d. x 75 µm i.d. fused silica. Peptides in a selected second dimension fraction 
were loaded onto this column and eluted with a gradient of acetonitrile/0.67% 
triethylamine acetate (TEAA)/water in 0.1% triethylamine acetate/water, with the 
concentration of acetonitrile increasing from 0 to 60% in 40 min. The flow rate was 
approximately 300 nl/min, and fractions were collected into 25 µl of 0.1% acetic acid 
every 30 seconds. In all RP-HPLC experiments, peptides were detected by monitoring 
UV absorbance at 214 nm. 
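
The gradient program and fraction collection described above can be worked through numerically. The following Python sketch linearly interpolates the programmed acetonitrile percentage of the first dimension gradient and computes the per-fraction volume and fraction count for the third dimension separation. The function names are hypothetical, and the calculation assumes ideal linear segments and a constant flow rate; it ignores delay volume between the pump and the column.

```python
# Illustrative sketch of the first-dimension gradient and third-dimension fraction arithmetic.
from typing import List, Tuple

# (time in min, % acetonitrile) breakpoints taken from the text.
FIRST_DIMENSION: List[Tuple[float, float]] = [(0, 0), (5, 9), (55, 36), (62, 60)]


def percent_acetonitrile(t: float,
                         program: List[Tuple[float, float]] = FIRST_DIMENSION) -> float:
    """Linearly interpolate the programmed acetonitrile percentage at time t (min)."""
    if not program[0][0] <= t <= program[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, c0), (t1, c1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return c0 + (c1 - c0) * (t - t0) / (t1 - t0)
    raise AssertionError("unreachable")


def fraction_volume_nl(flow_nl_per_min: float, interval_s: float) -> float:
    """Volume of eluate collected per fraction, in nanoliters."""
    return flow_nl_per_min * interval_s / 60.0


def fraction_count(gradient_min: float, interval_s: float) -> int:
    """Number of fractions collected over the gradient."""
    return int(gradient_min * 60 // interval_s)
```

Under these assumptions, a flow of about 300 nl/min collected every 30 seconds corresponds to roughly 150 nl of eluate per fraction, and a 40-minute gradient yields 80 fractions.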

CTL Epitope Reconstitution Assay 

Aliquots of each RP-HPLC fraction were tested for the presence of 
peptides that could sensitize C1R-A3 targets for lysis by VMM18 CTL in standard four- 
hour 51Cr-release assays as previously described (Hogan et al., 1998, Cancer Res 58, 
5144-5150). 

Mass Spectrometric Analyses 

Active RP-HPLC fractions were screened by on-line RP- 
HPLC/electrospray ionization mass spectrometry (MS) using a homemade microcapillary 
column and a Finnigan-MAT TSQ 7000 triple quadrupole mass spectrometer (Finnigan, 
San Jose, CA). Approximately one percent of the active RP-HPLC fraction was loaded 
onto a section of 185-µm o.d. x 75-µm i.d. fused silica packed with 10 to 12 cm of 10-µm 
C18 particles. Peptides were eluted directly into the mass spectrometer using a 
10-minute 0-60% acetonitrile in 0.1 M acetic acid gradient. Ions were formed by 
electrospray ionization, and mass spectra were recorded by scanning between mass to 
charge ratios (m/z) 300 and 1400 every 1.5 seconds. 

Active second dimension HPLC fractions were analyzed using an effluent 
splitter on the microcapillary HPLC column. The column (360-µm o.d. x 100-µm i.d. 
with a 25-cm C18 bed) was connected with a zero dead volume tee (Valco, Houston, TX) 
to two pieces of fused silica of different lengths (25-µm and 40-µm i.d.). Peptides were 
eluted with a 34-minute gradient of 0-60% acetonitrile in 0.1 M acetic acid. The 25-µm 
capillary deposited one-fifth of the HPLC effluent into the wells of a microtiter plate for 
use in a CTL epitope reconstitution assay, while the remaining four-fifths of the effluent 
was directed into the mass spectrometer, with mass spectra recorded as described above. 
Peptide sequences were determined by collision-activated dissociation 
(CAD) tandem mass spectrometry using an LCQ (Finnigan) ion trap mass spectrometer 
and methods as described (Cox et al., 1994, Science 264, 716-719). 
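
The one-fifth to four-fifths split of the effluent described above depends on the relative flow resistance of the two fused silica capillaries. As a rough sketch, assuming laminar (Hagen-Poiseuille) flow and an equal pressure drop across both capillaries, neither of which is stated in the text, the required length ratio can be estimated as follows. The function name is hypothetical.

```python
def length_ratio_for_split(d_a_um, d_b_um, fraction_a):
    """Return L_a / L_b so that capillary a carries fraction_a of the total flow.

    Assumes Hagen-Poiseuille flow, where flow Q is proportional to d**4 / L
    at equal pressure drop, so Q_a / Q_b = (d_a / d_b)**4 * (L_b / L_a).
    """
    if not 0.0 < fraction_a < 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    flow_ratio = fraction_a / (1.0 - fraction_a)  # Q_a / Q_b
    return (d_a_um / d_b_um) ** 4 / flow_ratio


if __name__ == "__main__":
    # 25-um capillary carrying one-fifth of the flow, 40-um capillary the rest.
    ratio = length_ratio_for_split(25.0, 40.0, 0.2)
    print(f"L(25 um) / L(40 um) = {ratio:.2f}")
```

Under these assumptions the 25-µm capillary would be roughly 0.6 times the length of the 40-µm capillary; actual lengths would be tuned empirically.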

Peptide Synthesis 

Peptides were synthesized using a Gilson (Madison, WI) AMS 422 
multiple peptide synthesizer using conventional FMOC chemistry. Peptides were 
purified by RP-HPLC using a 4.6-mm i.d. x 100-mm long POROS (Perseptive 
Biosystems, Cambridge, MA) column and a 10-minute 0 to 60% acetonitrile in 0.1% 
TFA gradient. 

Total mRNA Isolation 

Total RNA was prepared from 2-10 x 10^6 cells using the RNeasy® Mini 
kit (Qiagen, Valencia, CA) as per the kit instructions. RNA was quantified by 
absorbance at 260 nm. 
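
Quantification by absorbance at 260 nm is commonly done with the rule of thumb that one A260 unit corresponds to about 40 µg/ml of RNA in a 1-cm path. That conversion factor is a general convention rather than a value given in this document, and the function name and example reading are hypothetical.

```python
# Conventional conversion factor (assumption): 1 A260 unit ~ 40 ug/ml RNA at 1 cm.
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate RNA concentration of the undiluted sample from an A260 reading."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor / path_length_cm


if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.25 on a 1:50 dilution.
    print(rna_concentration_ug_per_ml(0.25, dilution_factor=50))
```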

PCR Primers 

The gene specific primers (GSP) 1361 and 1362 are specific for GAPDH 
and the remaining primers are directed towards the TAG gene. 

1361: 5'-CCACCCATGGCAAATTCCATGGCA-3' (SEQ ID NO: 26) 
1362: 5'-TCTAGACGGCAGGTCAGGTCCACC-3' (SEQ ID NO: 27) 
A52: 5'-AGGAAGGGGCTCCCACAGTGC-3' (SEQ ID NO: 28) 
A73: 5'-AGCGGCGGGCTGAAGGA-3' (SEQ ID NO: 29) 
A73.92: 5'-AGCGGCGGGCTGAAGGACTC-3' (SEQ ID NO: 30) 
C723: 5'-CCCAGGTTAGAACGGTCAGCAGAA-3' (SEQ ID NO: 31) 
E600: 5'-GAGGGTAGGGTGGTCATTGTGTCA-3' (SEQ ID NO: 32) 
F473: 5'-CAGCACAACAGGAACATTCAGTGG-3' (SEQ ID NO: 33) 
G608: 5'-G<jGGGATTTTATTGCGGTGAAAGT-3' (SEQ ID NO: 34) 
RLS-F-A: 5'-CCAGGAAGGGGCTCCCACAGT-3' (SEQ ID NO: 35) 
RLS-F-B: 5'-CTGTCACGTCTCAGCAATAGA-3' (SEQ ID NO: 36) 
RLS-F-15: 5'-AAGGACTCCTCAAGTGCCACCAAAG-3' (SEQ ID NO: 37) 
RLS-F-180: 5'-GGAAGGGGCTCCCACAGT-3' (SEQ ID NO: 38) 
RLS-F-216: 5'-ACTCCTCAAGTGCCACCAAA-3' (SEQ ID NO: 39) 
RLS-R-331: 5'-CTGCTTACCTCAAGAGCAGTCT-3' (SEQ ID NO: 40) 
RLS-R-119: 5'-GCAGTCTATTGCTGAGACGTGACAG-3' (SEQ ID NO: 41) 

RT-PCR 

RT-PCR (Promega, Madison, WI) was used to screen VMM12 and 
VMM18 mRNA for the expression of a gene coding the RLSNRLLLR sequence. The 
primer pairs RLS-F-180/RLS-R-331 and RLS-F-216/RLS-R-331 were used to amplify 
152 bp and 116 bp fragments, respectively. RT-PCR conditions were: 48°C for 45 min; 
94°C for 2 min; 35 cycles of 94°C for 30 s, 50°C for 60 s, 68°C for 60 s; 68°C for 5 min. 
For all other PCR reactions, total RNA was first converted to cDNA by using the 
SuperScript™ First-Strand Synthesis System (Invitrogen, Carlsbad, CA). PCR was then 
performed on 250 ng of cDNA using Platinum Taq High Fidelity (Invitrogen). The PCR 
mixes were heated to 94°C for 2 min, 30 cycles of amplification were performed, 
followed by a final extension at 68°C for 5 min. When amplifying the TAG genes, the 30 
cycles consisted of 94°C for 30 s, 62°C for 30 s, and 68°C for 60 s. When the GAPDH 
gene was amplified, the 30 cycles consisted of 94°C for 30 s, 60°C for 30 s, and 68°C for 
60 s. The PCR products were visualized on ethidium bromide stained agarose gels. 
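
The stated fragment sizes for the two RLS primer pairs are consistent with the numbers embedded in the primer names. The sketch below assumes, without confirmation from the text, that those numbers are 1-based coordinates of the outermost base of each primer on the same reference sequence; `amplicon_length` is a hypothetical helper.

```python
def amplicon_length(forward_start, reverse_end):
    """Length of a PCR product spanning forward_start..reverse_end inclusive.

    Assumption: both values are 1-based coordinates on the same reference.
    """
    if reverse_end < forward_start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return reverse_end - forward_start + 1


# Coordinates inferred from the primer names (an assumption, see lead-in).
PRIMER_PAIRS = {
    ("RLS-F-180", "RLS-R-331"): (180, 331),
    ("RLS-F-216", "RLS-R-331"): (216, 331),
}

if __name__ == "__main__":
    for (fwd, rev), (start, end) in PRIMER_PAIRS.items():
        print(f"{fwd}/{rev}: {amplicon_length(start, end)} bp")
```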

DNA Sequencing 

Automated DNA sequencing was performed at the University of Virginia 
DNA Sequencing Core on either an Applied Biosystems 377 Prism DNA Sequencer or 
3100 Genetic Analyzer, using Big Dye terminator chemistry with Taq DNA polymerase. 

Rapid Amplification of cDNA Ends (RACE) 

The GeneRacer™ system (Invitrogen) was used to perform both 5' and 3' 
RACE. For the 5' RACE procedure, the GeneRacer™ 5' Primer was used in conjunction 
with the GSP RLS-R-119 (5'-GCAGTCTATTGCTGAGACGTGACAG-3'; SEQ ID NO: 
41). Cycling conditions were: 94°C for 2 min; 5 cycles of 94°C for 30 s, 76°C for 2 min; 
5 cycles of 94°C for 30 s, 74°C for 2 min; 5 cycles of 94°C for 30 s, 72°C for 2 min; 15 
cycles of 94°C for 30 s, 70°C for 30 s, 72°C for 2 min; 72°C for 5 min. 

Nested PCR was used for the 3' RACE procedure. Outside reactions used 
the GeneRacer™ 3' primer in conjunction with either of the GSP primers RLS-F-A or 
RLS-F-15. Cycling conditions for the RLS-F-A PCR consisted of 94°C for 2 min; 5 
cycles of 94°C for 30 s, 68°C for 2 min; 5 cycles of 94°C for 30 s, 66°C for 2 min; 20 
cycles of 94°C for 30 s, 61°C for 30 s, 68°C for 2 min; 68°C for 10 min. Cycling 
conditions for the RLS-F-15 PCR consisted of 94°C for 2 min; 5 cycles of 94°C for 30 s, 
72°C for 2 min; 5 cycles of 94°C for 30 s, 70°C for 2 min; 20 cycles of 94°C for 30 s, 
65°C for 30 s, 68°C for 2 min; 68°C for 10 min. Inside reactions used the 3' 
GeneRacer™ nested primer with the GSP primer RLS-F-B. Cycling conditions were: 
94°C for 2 min; 14 cycles of 94°C for 30 sec, 76°C (decreasing 0.5°C per cycle) for 2 
min; 16 cycles of 94°C for 30 s, 68°C (decreasing 0.5°C per cycle) for 30 s, 68°C for 2 
min; 68°C for 10 min. 
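
The inside (nested) reaction uses a touchdown scheme in which a step temperature falls by 0.5°C per cycle. The per-cycle temperatures can be listed as below, assuming the first cycle of each phase runs at the stated starting temperature and the decrement applies from the second cycle onward; the text does not specify this detail, and the function name is hypothetical.

```python
def touchdown_temperatures(start_c, step_c, cycles):
    """Per-cycle temperatures for a touchdown phase.

    Assumption: cycle 1 runs at start_c and each later cycle is step_c lower.
    """
    return [start_c - step_c * i for i in range(cycles)]


if __name__ == "__main__":
    phase1 = touchdown_temperatures(76.0, 0.5, 14)  # 14 cycles starting at 76 C
    phase2 = touchdown_temperatures(68.0, 0.5, 16)  # 16 cycles starting at 68 C
    print("Phase 1:", phase1[0], "to", phase1[-1])
    print("Phase 2:", phase2[0], "to", phase2[-1])
```

Under this reading, the first phase steps from 76°C down to 69.5°C and the second from 68°C down to 60.5°C.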

The PCR products were visualized on ethidium bromide stained low 
melting agarose gels, and selected bands were purified using the QIAquick® (Qiagen) 
purification system. The purified DNA was cloned into pCR4-TOPO® (Invitrogen), 
transformed into One Shot® TOP10 Chemically Competent E. coli (Invitrogen), and 
selected with 100 µg/ml ampicillin on LB agar. DNA from individual colonies was 
purified using the Qiagen Plasmid Mini Kit. 

VMM18 CTL recognize three distinct HLA-A3-restricted epitopes 

The peptides bound to HLA-A3 molecules on 8 x 10^10 VMM18 tumor 
cells were purified as described in Materials and Methods, and fractionated by RP-HPLC 
using TFA as the organic modifier. A CTL epitope reconstitution assay was performed 
using 2.5% of each RP-HPLC fraction (2 x 10^9 cell equivalents), and three peaks of 
activity were observed. Peak B activity (fractions 26 - 28) corresponds to the previously 
described ALLAVGATK (SEQ ID NO: 42) peptide from Pmel17/gp100. 
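
The cell-equivalent figure quoted above follows directly from the starting cell number and the fraction of each RP-HPLC fraction that was assayed. A minimal check, using exact rational arithmetic and a hypothetical helper name:

```python
from fractions import Fraction


def cell_equivalents(total_cells, fraction_assayed):
    """Cell equivalents represented by an aliquot of a fraction.

    fraction_assayed is given as a decimal string (e.g. "0.025" for 2.5%)
    so that the arithmetic is exact.
    """
    return total_cells * Fraction(fraction_assayed)


if __name__ == "__main__":
    # 2.5% of the material from 8 x 10^10 cells.
    print(cell_equivalents(8 * 10**10, "0.025"))
```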

Identification of the Antigenic Peptides 

Pooled active fractions 15-17 (peak A), and active fraction 38 (peak C) 
were each further fractionated by RP-HPLC using HFBA as the organic modifier. In 
CTL epitope reconstitution assays, fractions 26 and 27 of this second fractionation of 
peak A contained the active peptide, and fractions 66 and 67 of a second fractionation of 
peak C contained the active peptide. The peptides were further fractionated by a third 
round of RP-HPLC, using TEAA as the organic modifier. In CTL epitope reconstitution 
assays, the peak A antigenic peptide was found to elute in fractions 22 - 24, while the 
peak C antigenic peptide was present primarily in fractions 32 - 34. 

The active peptide in peak A is SQNFPGSQK (SEQ ID NO: 25) 

Mass spectrometry analysis of the active third dimension RP-HPLC 
fractions representing peak A indicated that the abundance of the m/z 497 ion strongly 
correlated with the CTL epitope reconstituting activity. Analysis of fragment masses 
obtained from the CAD mass spectrum allowed the determination of the peptide sequence 
as SQNFPGSQK (SEQ ID NO: 25). This synthetic peptide was active in sensitizing 
C1R-A3 targets for lysis by VMM18 CTL at concentrations as low as 10 pM (Fig. 1). It 
was subsequently determined by RP-HPLC and mass spectrometry that the synthetic 
peptide SQNFPGSQK (SEQ ID NO: 25) co-eluted with the unknown m/z 497 in the 
active second and third dimension fractions (data not shown), indicating that this 
sequence represents the naturally processed and presented epitope. 
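
As a consistency check on the reported ion, the expected mass-to-charge ratio of a peptide can be computed from standard monoisotopic residue masses. The sketch below assumes the observed ion is the doubly protonated species, which the text does not state explicitly; the function name is hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added for the free N- and C-termini
PROTON = 1.007276  # mass of a proton


def peptide_mz(sequence, charge):
    """Monoisotopic m/z of an unmodified peptide carrying `charge` protons."""
    neutral = sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    print(f"SQNFPGSQK [M+2H]2+: {peptide_mz('SQNFPGSQK', 2):.2f}")
```

With this assumption SQNFPGSQK gives a value of about 496.74, in line with the nominal m/z 497 ion; the same calculation for RLSNRLLLR gives about 570.87, in line with the m/z 571 ion reported for peak C.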

The active peptide in peak C is RLSNRLLLR (SEQ ID NO: 12) 

Analysis of the active third dimension peak C fractions showed that the 
biological activity in epitope reconstitution assays correlated with the abundance of the 
m/z 571 ion. Analysis of the CAD mass spectra suggested that the peptide sequence 
included four leucine or isoleucine residues (labeled as X because these residues are not 
distinguishable by low-energy CAD). A mixture of peptides was therefore synthesized, 
with leucine and isoleucine incorporated at each of four positions in the sequence 
RXSNRXXXR (SEQ ID NO: 44), and this peptide cocktail of 16 peptides had potent 
epitope reconstituting activity (Fig. 2A). Each of the sixteen peptides was individually 
synthesized and tested in epitope reconstitution assays. A range of activities was 
observed, with most of the sequences sensitizing C1R-A3 targets for at least some 
VMM18 CTL-specific lysis, and with no one sequence being significantly and 
reproducibly superior to all of the others (data not shown). Subsequent RP-HPLC co- 
elution studies clearly demonstrated, however, that the unknown m/z 571 in the active 
fractions was RLSNRLLLR (SEQ ID NO: 12), and the epitope reconstitution assay 
showed that this peptide is active at concentrations as low as 10 pM (Fig. 2B). 
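
The sixteen candidate sequences arise because each of the four ambiguous positions in RXSNRXXXR can be either leucine or isoleucine (2^4 = 16). A short sketch of that enumeration, using a hypothetical helper name:

```python
from itertools import product


def expand_leu_ile(template, wildcard="X"):
    """Return every peptide obtained by placing L or I at each wildcard position."""
    slots = template.count(wildcard)
    peptides = []
    for choice in product("LI", repeat=slots):
        remaining = iter(choice)
        # Substitute wildcards left to right with the chosen residues.
        peptides.append(
            "".join(next(remaining) if aa == wildcard else aa for aa in template)
        )
    return peptides


if __name__ == "__main__":
    candidates = expand_leu_ile("RXSNRXXXR")
    print(len(candidates))
    print(candidates[:4])
```

Because leucine and isoleucine have the same mass, all sixteen candidates share one m/z value, which is why co-elution studies rather than mass alone were needed to single out RLSNRLLLR.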

Both SQNFPGSQK (SEQ ID NO: 25) and RLSNRLLLR (SEQ ID NO: 12) are presented 
on at least one other HLA-A3+ melanoma. 

Mass spectrometry analysis of RP-HPLC fractionated peptides eluted 
from immunoaffinity purified HLA-A3 from the melanoma cell line VMM12 
demonstrated that SQNFPGSQK (SEQ ID NO: 25) and RLSNRLLLR (SEQ ID NO: 12) 
peptides were both present in the expected fractions (data not shown) and thus, both 
peptides represent novel shared melanoma antigens. 

BLAST Search Results for the Gene(s) Coding for the SQNFPGSQK (SEQ ID NO: 25) 
and RLSNRLLLR (SEQ ID NO: 12) peptides 

Homology searches of SQNFPGSQK (SEQ ID NO: 25) in the public non- 
redundant human protein database yielded no exact matches, although the seven 
N-terminal amino acids of the peptide had an exact match in the Pmel17/gp100 sequence 
(residues 87-95). Nucleotide sequencing of the Pmel17/gp100 RT-PCR product from 
VMM18 cells yielded an exact match to the published sequence in this region, with no 
evidence of heterogeneity at these codons (data not shown), suggesting that the sequence 
does not arise as the result of a mutation or rearrangement of the Pmel17/gp100 gene. 
A homology search of the RLSNRLLLR (SEQ ID NO: 12) peptide yielded three exact 
matches: (i) AE003619, a Drosophila melanogaster genomic scaffold gene; (ii) 
AC106771, Homo sapiens chromosome 5 clone RP11-308B16; and (iii) AC106790, 
Homo sapiens chromosome 5 clone RP11-376E20. The human sequences are 
overlapping clones, and in both cases the sequence coding for the RLSNRLLLR (SEQ ID 
NO: 12) peptide is immediately followed by a stop codon, suggesting that the peptide 
might occur at the C-terminal end of a protein expressed from a gene coded for in these 
two clones. To determine if such a gene was expressed in VMM18, PCR primers were 
designed to amplify a region of DNA that would encompass that coding for the 
RLSNRLLLR (SEQ ID NO: 12) peptide, as well as sequence immediately 5' to that 
region. Two primer sets (RLS-F-180/RLS-R-331 and RLS-F-216/RLS-R-331) 
respectively amplified the predicted 152 bp and 116 bp fragments from both VMM12 and 
VMM18 cDNA (data not shown), thus confirming that a gene encompassing this region 
was expressed in melanoma tumor cell lines known to express the peptide. 

Identification of the gene coding for the source protein containing the RLSNRLLLR 
(SEQ ID NO: 12) peptide 

The GeneRacer™ method of 5' RACE was chosen as it ensures the 
amplification of full-length mRNA by directing the ligation of GeneRacer™ RNA Oligo 
to mRNA that has not been truncated at the 5' end. PCR was performed with the 
GeneRacer™ 5' Primer and the 3' reverse primer, RLS-R-119, that was designed to 
overlap partially the nucleotide sequence coding for the RLSNRLLLR (SEQ ID NO: 12) 
peptide. An approximately 200 bp fragment was obtained, cloned into pCR4-TOPO, and sequenced. 
A BLAST search of the obtained sequence demonstrated that it was completely 
homologous to AC106771 and overlapped with the sequence obtained from the 152 and 
116 bp fragments. The 5' end of the insert read directly into the complete GeneRacer™ 
RNA Oligo sequence, thus confirming that the complete 5' end of the gene had been 
obtained. 
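
The partial overlap between the reverse primer RLS-R-119 and the peptide-coding sequence can be illustrated by reverse-complementing the primer and translating the result with the standard genetic code. The primer sequences are those listed under PCR Primers; the choice of reading frame and the helper names are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

RLS_R_119 = "GCAGTCTATTGCTGAGACGTGACAG"  # SEQ ID NO: 41
RLS_F_B = "CTGTCACGTCTCAGCAATAGA"        # SEQ ID NO: 36


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna))


def translate(dna, frame=0):
    """Translate complete codons of dna starting at offset `frame`."""
    usable = len(dna) - (len(dna) - frame) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(frame, usable, 3))


if __name__ == "__main__":
    sense = reverse_complement(RLS_R_119)
    print(sense)
    print(translate(sense))
```

The sense-strand reading of the primer translates to LSRLSNRL in the first frame, which contains the first six residues of RLSNRLLLR, and it begins with the full RLS-F-B sequence, so the two primers cover the same locus on opposite strands.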

3' RACE was then used to obtain 3' sequence information for the 
RLSNRLLLR(SEQ ID NO: 12)-coding gene. The two sets of primers yielded two 
dominant fragments each, and the difference in the size of the fragments between the two 
primer sets corresponded to the predicted size difference based on the location of the 5' 
GSP. The fragments were cloned into pCR4-TOPO and sequenced. A total of four 
different sequences were obtained for the 3' end of the gene. The 3' end of the sequences 
corresponded to the GeneRacer™ Oligo dT primer, thus indicating that the 3' end 
of the genes had been obtained. 

Example 2 Characterization of the TAG Genes 
Gene Structure 

By combining the 5' and 3' sequence information, a total of four different 
isoforms of the gene could be constructed, TAG-1, TAG-2a, TAG-2b, and TAG-2c (Fig. 
3). These sequences were further confirmed by sequencing clones obtained following 
RT-PCR with primers specific for the 5' and 3' end of each isoform. The isoforms are 
composed of three to four exons each, with each having the a1 and a2 exons in common 
at the 5' end of the gene. BLAST searches indicate that the genes are coded for on the 
short arm of Chromosome 5, and have 100% identity to sequences in clones AC106771, 
AC106790, and AC119151. The seven identified exons span approximately 230,000 
nucleotides in the genomic sequence (Fig. 4). Appropriate splice sites exist at each of the 
intron/exon boundaries to allow splicing of the exons to occur. During the course of 
sequencing clones that corresponded to TAG-2c, an additional isoform (TAG-3) was 
isolated that lacked the a2 exon and was composed of the a1, a4, and a7 exons. The 
splicing of the a1 exon to the a4 exon changed the nucleotide sequence such that the 
carboxy-terminal Arg in the RLSNRLLLR (SEQ ID NO: 12) peptide was replaced by a 
Ser (RLSNRLLLS; SEQ ID NO: 46). Although no significant open reading frame 
initiated from an AUG codon exists within the sequences, there are three 
nonstandard initiation codons (two CUG and one ACG), all of which are in frame with 
one another, and all of which initiate an open reading frame that would code for the 
peptide (Fig. 3). The sequence coding for the RLSNRLLLR (SEQ ID NO: 12) peptide 
spans the junction between the first two exons, with the first 26 nucleotides coming from 
the a1 exon and the 27th nucleotide coming from the a2 exon. 
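
Because only the 27th nucleotide of the peptide-coding sequence comes from the second exon, an alternative downstream exon can change the final residue by altering just the third base of the last codon. The sketch below uses the standard genetic code to find two-base codon prefixes that can encode either Arg or Ser depending on the third base. It is an inference from the genetic code, not a statement of the actual exon sequences, and the helper name is hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def shared_prefixes(aa1, aa2):
    """Map each two-base prefix that can encode both residues to the
    third bases giving each residue."""
    result = {}
    for prefix in ("".join(p) for p in product(BASES, repeat=2)):
        third_bases = {
            aa: sorted(b for b in BASES if CODON_TABLE[prefix + b] == aa)
            for aa in (aa1, aa2)
        }
        if all(third_bases.values()):
            result[prefix] = third_bases
    return result


if __name__ == "__main__":
    print(shared_prefixes("R", "S"))
```

Only the prefix AG qualifies: AGA or AGG encodes Arg, whereas AGC or AGT encodes Ser, which is consistent with a single-nucleotide difference at the exon junction producing RLSNRLLLS instead of RLSNRLLLR.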

Protein Structure 

Depending upon the initiation codon used, the TAG-1 gene potentially 
codes for a 99 amino acid (aa) (TAG-1α), a 63 aa (TAG-1β), and a 59 aa peptide 
(TAG-1γ), with respective molecular weights of 10,615 D, 6,945 D, and 6,577 D. While the 
TAG-2a, TAG-2b, and TAG-2c genes differ from one another in their fourth exon, all of 
them potentially express identical proteins as the stop codon is located in the third exon. 
These genes would use the same initiation codons as in the TAG-1 gene, but would differ 
from the TAG-1 gene at their 3' end. The expressed proteins would be 93 (TAG-2α), 57 
(TAG-2β), and 53 (TAG-2γ) aa in length, with molecular weights of 9,727 D, 6,057 D, 
and 5,689 D. The TAG-1 protein isoforms, but not the TAG-2 protein isoforms, contain 
the sequence Asn-Ser-Thr and thus could potentially exist in a glycosylated form. The 
TAG-1 isoforms have three cysteines and TAG-2 isoforms have four cysteines, which 
could lead to interchain or intrachain disulfide bond formation. A BLAST search of the 
TAG-1 and TAG-2 protein isoforms does not reveal any significant homology with 
known proteins. 
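
The potential glycosylation noted above rests on the N-linked glycosylation sequon, conventionally written Asn-X-Ser/Thr where X is any residue except proline. A minimal motif scan is sketched below; the example sequence is hypothetical and is not a TAG protein sequence, and the function names are illustrative.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position, motif) for each N-X-S/T sequon in protein."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


def count_cysteines(protein):
    """Number of cysteine residues available for disulfide bond formation."""
    return protein.count("C")


if __name__ == "__main__":
    example = "MCANSTLCPC"  # hypothetical sequence, for illustration only
    print(find_sequons(example))
    print(count_cysteines(example))
```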

Expression of the TAG-1, -2a, -2b, and -2c genes in melanoma tumor lines 

PCR reactions specific for each of the TAG-1, -2a, -2b, and -2c genes were 
performed on cDNA obtained from 32 established melanoma lines (Table 1). Products 
were visualized on ethidium bromide stained agarose gels. Screening was initially 
performed with 30 rounds of amplification with (+) product being easily visualized; (+/-) 
product could be visualized, but the band was very light; (*) product visible only 
following 40 rounds of amplification; (-) product not visible after 30 or 40 rounds of PCR 
amplification. When comparing the expression of the four genes in any given tumor line, 
TAG-1 and -2a were expressed at the highest levels, TAG-2b was poorly expressed, and 
TAG-2c was expressed at an intermediate level. With the exception of EB81-Mel, each 
tumor line expressed all four genes or none at all. In the case of EB81-Mel, the TAG-1, 
-2a, and -2b genes were only seen following 40 cycles of amplification. Overall, TAG-1, 
-2a, and -2b are expressed in 88% of the melanoma lines tested, while TAG-2c is 
expressed in 84% of the melanoma lines tested. To ensure that the expression of the TAG 
gene family was not an artifact of in vitro culture conditions, mRNA was prepared from a 
cryopreserved aliquot of the original tumor sample from which the VMM12 tumor line 
was established. RT-PCR was positive for each of the TAG genes, thus establishing that 
TAG is expressed in uncultured melanoma cells (data not shown). TAG-3 appears to be 
barely detectable in some, but not all of the melanoma samples. 
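
The percentages quoted for the melanoma lines can be reproduced from the counts of positive lines. The sketch below rounds halves upward, which is an assumption about how the figures were rounded; Python's built-in `round` rounds halves to the nearest even integer and would therefore give 12 rather than 13 for a value such as 1/8 (12.5%).

```python
def percent_positive(positive, total):
    """Whole-number percentage of positive samples, rounding halves upward.

    Integer arithmetic avoids floating-point surprises at exact halves.
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total and total > 0")
    return (200 * positive + total) // (2 * total)


if __name__ == "__main__":
    for positive in (28, 27):
        print(f"{positive}/32 = {percent_positive(positive, 32)}%")
```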

Table 1. Expression of TAG-1, TAG-2a, TAG-2b, and TAG-2c in Established 
Melanoma Cell Lines* 



Tissue  TAG-1  TAG-2a  TAG-2b  TAG-2c 
AVL3-MEL  +  +/-  +/-  +/- 
DM93 
DM122 
DM281 
DM319 
DM331  +  +/-  +/-  +/- 
EB81-Mel 
LB373-Mel 
Na8-Mel 
SK-Mel-2  +/- 
SK-Mel-28  +/- 
VMM1 
VMM5 
VMM12 
VMM15  +  + 
VMM19 
VMM34 
VMM39  +  +  +/-  + 
VMM86  +/- 
VMM105 
VMM150 
VMM273 
VMM330 
Total Positive (% Positive)  29/32 (91%)  28/32 (88%)  28/32 (88%)  27/32 (84%) 

*PCR was performed as described in Materials and Methods. (+) product was easily 
visualized; (+/-) product could be visualized, but the band was very light; (-) product not visible 

Expression of the TAG-1, -2a, -2b, and -2c genes in transformed and malignant 
leukocyte-derived cell lines 

RT-PCR of mRNA derived from multiple B-LCL and from the hybrid T-B LCL, 
T2-A3, demonstrates that the TAG gene family is not expressed in transformed B or T cells 
(Table 2). All four TAG genes were, however, expressed in K562, a myelogenous leukemia cell 
line (Table 2). PCR products were visualized on ethidium bromide stained agarose gels. 
Screening was initially performed with 30 rounds of amplification with (+) product being easily 
visualized; (+/-) product could be visualized, but the band was very light; (-) product not visible 
after 30 or 40 rounds of PCR amplification. 






Expression of the TAG-1, -2a, -2b, and -2c genes in normal tissue 

The expression of the TAG family of genes was determined in mRNA derived 
from normal tissues (Table 3). Products were visualized on ethidium bromide stained agarose 
gels. Screening was initially performed with 30 rounds of amplification with (+) product being 
easily visualized after 30 rounds of amplification; (*) product visible only following 40 rounds of 
amplification; (-) product not visible after 30 or 40 rounds of PCR amplification. The results 
demonstrated that with the exception of the testis and placenta, the TAG genes are not expressed 
in normal tissue. The expression of TAG-1 can be seen in the placenta following 30 cycles of 
amplification, and TAG-2a is weakly detectable. Upon 40 cycles of amplification, TAG-1, -2a, 
and -2b are easily detected in placenta, but TAG-2c is not visualized. TAG-1 and -2a expression 
is readily observed in the testis following 30 cycles of amplification, and all four genes are 
detectable following 40 cycles of amplification. The expression of the TAG genes in testis and 
placenta, but not in other normal tissues indicates that they share expression profiles with other 
cancer/testis antigens. 

Table 2. Expression of TAG-1, TAG-2a, TAG-2b, and TAG-2c in Non-Melanoma Cell Lines* 

Cell Line  Cell Type  TAG-1  TAG-2a  TAG-2b  TAG-2c 
C1R-A3  B-LCL 
JY  B-LCL 
VMM15-EBV  B-LCL 
VMM18-EBV  B-LCL 
T2-A3  Hybrid B/T-LCL 
K562  Myelogenous Leukemia 

*PCR was performed as described in Materials and Methods. (+) product was easily visualized; 
(+/-) product could be visualized, but the band was very light; (-) product not visible 

Table 3. Expression of TAG-1, TAG-2a, TAG-2b, and TAG-2c in Normal Human Tissues* 

*PCR was performed as described in Materials and Methods. (+) product easily 
visualized after 30 rounds of amplification; (-) product was not visible after 30 or 40 rounds 
of amplification; (*) product visible following 40 rounds of amplification, but not 
following 30 rounds of amplification. 

BLAST search of the TAG genes in the Human EST database 

A BLAST search of the TAG genes against the GenBank Human EST 
database yielded homology with two chronic myelogenous leukemia (CML) sequences 
and one hepatocellular carcinoma sequence. CML clone, BF210037, has 96% identity of 
a 603 bp sequence with the a1, a2, and a3 exons of TAG-1, while CML clone, 
BF240333, has a 191 bp region out of 741 bp that is 94% identical to the a1 exon of 
TAG-1, -2a, -2b, and -2c through the first eighteen nucleotides coding the RLSNRLLLR 
(SEQ ID NO: 12) peptide, after which the sequences diverge. The hepatocellular 
carcinoma clone, AV695059, has a 272 bp region out of 421 bp that is 98% identical with 
the TAG a1 exon and all but the last 3 bp of the a2 domain, after which the sequence 
diverges. These results demonstrate that the TAG genes may be expressed in a variety of 
tumors, and that there may be additional isoforms that we have not yet identified. 

Expression of the TAG-1, -2a, -2b, and -2c genes in tumors of other than melanocyte 
origin 

The expression of the TAG family of genes was determined in mRNA 
derived from a variety of cancer cell lines and/or fresh cancer tissue (Table 4). The 
results demonstrated that at least one isoform of the TAG genes was expressed in lung, 
breast, ovarian, colon, gastric and prostate carcinomas. Expression was not observed in 
brain tumors, pharyngeal squamous carcinoma, tongue squamous cell carcinoma, and 
cervical squamous cell carcinoma. With only one to four samples of each of the latter 
cancers tested, it is possible that expression would be observed in a fraction of the 
samples with a larger sampling. These results show that the TAG genes are expressed in 
a variety of cancers in addition to melanoma. 

Table 4. Expression of TAG-1, TAG-2a, TAG-2b, and TAG-2c in Various Human 
Cancers 

Number and Percent of Cancers Expressing the TAG Genes 

Cancer                        PCR Cycles  TAG-1       TAG-2a      TAG-2b      TAG-2c 
Lung                          30          4/9 (44%)   3/9 (33%)   2/9 (22%)   2/9 (22%) 
                              40          6/9 (67%)   7/9 (78%)   4/9 (44%)   4/9 (44%) 
Breast                        30          1/8 (13%)   1/8 (13%)   1/8 (13%)   1/8 (13%) 
                              40          6/8 (75%)   1/8 (13%)   1/8 (13%)   1/8 (13%) 
Ovarian                       30          1/3 (33%)   1/3 (33%)   1/3 (33%)   1/3 (33%) 
                              40          2/3 (67%)   2/3 (67%)   2/3 (67%)   1/3 (33%) 
Colon                         30          2/5 (40%)   2/5 (40%)   0/5 (0%)    0/5 (0%) 
                              40          5/5 (100%)  4/5 (80%)   2/5 (40%)   2/5 (40%) 
Brain                         30          0/4 (0%)    0/4 (0%)    0/4 (0%)    0/4 (0%) 
                              40          3/4 (75%)   2/4 (50%)   0/4 (0%)    2/4 (50%) 
Gastric                       30          3/9 (33%)   1/9 (11%)   1/9 (11%)   1/9 (11%) 
                              40          3/9 (33%)   1/9 (11%)   1/9 (11%)   1/9 (11%) 
Pharyngeal, Tongue, Cervical  30          0/3 (0%)    0/3 (0%)    0/3 (0%)    0/3 (0%) 
                              40          2/3 (67%)   0/3 (0%)    0/3 (0%)    0/3 (0%) 
Prostate                      35          4/8 (50%)   1/8 (13%)   1/8 (13%)   0/8 (0%) 